Sciweavers

722 search results - page 52 / 145
» Data Cleaning: Problems and Current Approaches
EMNLP
2009
13 years 6 months ago
Improved Statistical Machine Translation Using Monolingually-Derived Paraphrases
Untranslated words still constitute a major problem for Statistical Machine Translation (SMT), and current SMT systems are limited by the quantity of parallel training texts. Augm...
Yuval Marton, Chris Callison-Burch, Philip Resnik
IJDE
2007
13 years 8 months ago
Session Based Packet Marking and Auditing for Network Forensics
The widely acknowledged problem of reliably identifying the origin of network data has been the subject of many research works. Due to the nature of Internet Protocol, a source IP...
Omer Demir, Ping Ji, Jinwoo Kim
HPDC
2010
IEEE
13 years 9 months ago
Data parallelism in bioinformatics workflows using Hydra
Large-scale bioinformatics experiments are usually composed of a set of data flows generated by a chain of activities (programs or services) that may be modeled as scientific work...
Fábio Coutinho, Eduardo S. Ogasawara, Danie...
PVLDB
2010
13 years 7 months ago
Toward Scalable Keyword Search over Relational Data
Keyword search (KWS) over relational databases has recently received significant attention. Many solutions and many prototypes have been developed. This task requires addressing ...
Akanksha Baid, Ian Rae, Jiexing Li, AnHai Doan, Je...
BMCBI
2007
13 years 8 months ago
Robust regression for periodicity detection in non-uniformly sampled time-course gene expression data
Background: In practice many biological time series measurements, including gene microarrays, are conducted at time points that seem to be interesting in the biologist's opin...
Miika Ahdesmäki, Harri Lähdesmäki, ...