Tom D. Breeze, Bernard E. Vaissière, Riccardo Bommarco, Theodora Petanidou, Nicos Seraphides, Lajos Kozák, Jeroen Scheper, Jacobus C. Biesmeijer, David Kleijn, Steen Gyldenkærne, Marco Moretti, Andrea Holzschuh, Ingolf Steffan-Dewenter, Jane C. Stout, Meelis Pärtel, Martin Zobel, Simon G. Potts, Gen Hua Yue
Declines in insect pollinators across Europe have raised concerns about the supply of pollination services to agriculture. Simultaneously, EU agricultural and biofuel policies have encouraged substantial growth in the cultivated area of insect pollinated crops across the continent. Using data from 41 European countries, this study demonstrates that the recommended number of honeybees required to provide crop pollination across Europe has risen 4.9 times as fast as honeybee stocks between 2005 and 2010. Consequently, honeybee stocks were insufficient to supply >90% of demands in 22 countries studied. These findings raise concerns about the capacity of many countries to cope with ...
PLoS Computational Biology
Joern Toedling, Wolfgang Huber, Fran Lewitter
Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data, and it offers tools that cover a broad range of computational methods, visualizations, and experimental data types, and is designed to allow the construction of scalable, reproducible, and interoperable workflows. A consequence of the wide range of functionality of Bioconductor and its concurrency with research progress in biology and computational statistics is that using its tools can be daunting for a new user. Various books provide a good general introduction to R and Bioconductor, and most Bioconductor packages are accompanied by extensive ...
Feng Zhao, Jinbo Xu
Although studied extensively, designing a highly accurate protein energy potential is still challenging. Many knowledge-based statistical potentials are derived from the inverse of the Boltzmann law and consist of two major components: observed atomic interacting probability and reference state. These potentials mainly distinguish themselves in the reference state and use a similar simple counting method to estimate the observed probability, which is usually assumed to correlate with only atom types. This article takes a rather different view on the observed probability and parameterizes it by the protein sequence profile context of the atoms and the radius of the gyration, ...
Steven W. Kembel, James F. Meadow, Timothy K. O’Connor, Gwynne Mhuireach, Dale Northcutt, Jeff Kline, Maxwell Moriyama, G. Z. Brown, Brendan J. M. Bohannan, Jessica L. Green, Bryan A. White
Architectural design has the potential to influence the microbiology of the built environment, with implications for human health and well-being, but the impact of design on the microbial biogeography of buildings remains poorly understood. In this study we combined microbiological data with information on the function, form, and organization of spaces from a classroom and office building to understand how design choices influence the biogeography of the built environment microbiome.
Sequencing of the bacterial 16S gene from dust samples revealed that indoor bacterial communities were extremely diverse, containing more than 32,750 OTUs (operational taxonomic units, 97% sequence similarity cutoff), but ...
Aerosol Science and Technology
A. Prakash, A. P. Bapat, M. R. Zachariah
In this article, a simple numerical method to solve the general dynamic equation (GDE) has been described and the software made available. The model solution described is suitable for problems involving gas-to-particle conversion due to supersaturation, coagulation, and surface growth of particles via evaporation/condensation of monomers. The model is based on simplifying the sectional approach to discretizing the particle size distribution with a nodal form. The GDE developed here is an extension of the coagulation equation solution method developed by Kari Lehtinen, wherein particles exist only at nodes, as opposed to continuous bins in the sectional method. The results have ...
Frontiers in Environmental Microbiology
Intensive agriculture, rapid industrialization, and other anthropogenic activities have caused environmental pollution and land degradation, and have increased pressure on natural resources, contributing to their adulteration.
Statistical Analysis and Data Mining
S. Gardner-Lubbe, N. J. le Roux, H. Maunders, V. Shah, S. Patwardhan
Although principal component analysis is widely used in the exploration of microarray data, the advantages of constructing a biplot as a multivariate analog to a scatterplot are seldom exploited. This paper illustrates the benefits of using biplots with microarray data to (1) visually display both the treatments and genes of such extreme high-dimensional data in a single plot, (2) relate the treatments to the underlying biological process through the use of biplot axes, and (3) optimally separate classes and explore the differentially associated expression in genes. In this analysis, we have used gene expression measurements from human bronchial epithelial cells ...
Science Translational Medicine
N. P. Tatonetti, P. P. Ye, R. Daneshjou, R. B. Altman
Adverse drug events remain a leading cause of morbidity and mortality around the world. Many adverse events are not detected during clinical trials before a drug receives approval for use in the clinic. Fortunately, as part of postmarketing surveillance, regulatory agencies and other institutions maintain large collections of adverse event reports, and these databases present an opportunity to study drug effects from patient population data. However, confounding factors such as concomitant medications, patient demographics, patient medical histories, and reasons for prescribing a drug often are uncharacterized in spontaneous reporting systems, and these omissions can limit the use of quantitative signal ...
J. Hippisley-Cox, C. Coupland
Derive and validate a new clinical risk prediction algorithm QThrombosis to estimate individual patients’ risk of venous thromboembolism.
The derivation cohort included 14 756 incident cases of venous thromboembolism from 10 095 199 person years of observation (rate of 14.6 per 10 000 person years). The validation cohort included 6913 incident cases from 4 632 694 person years of observation (14.9 per 10 000 person years).
We have developed and validated a new risk prediction model that quantifies absolute risk of thrombosis at 1 and 5 years. It can help identify patients at high risk of venous thromboembolism for ...
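The incidence rates quoted for the derivation and validation cohorts follow directly from the case counts and person-years given above. A minimal arithmetic check, in which the helper name `rate_per_10000` is purely illustrative and not part of QThrombosis:

```python
def rate_per_10000(cases: int, person_years: int) -> float:
    """Crude incidence rate expressed per 10 000 person-years."""
    return cases / person_years * 10_000


# Figures taken from the cohort description in the abstract.
derivation = rate_per_10000(14_756, 10_095_199)
validation = rate_per_10000(6_913, 4_632_694)

print(f"derivation: {derivation:.1f} per 10 000 person-years")
print(f"validation: {validation:.1f} per 10 000 person-years")
```

Rounded to one decimal place, the two values agree with the 14.6 and 14.9 reported in the abstract.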
Cole Trapnell, David G Hendrickson, Martin Sauvageau, Loyal Goff, John L Rinn, Lior Pachter
Differential analysis of gene and transcript expression using high-throughput RNA sequencing (RNA-seq) is complicated by several sources of measurement variability and poses numerous statistical challenges. We present Cuffdiff 2, an algorithm that estimates expression at transcript-level resolution and controls for variability evident across replicate libraries. Cuffdiff 2 robustly identifies differentially expressed transcripts and genes and reveals differential splicing and promoter-preference changes. We demonstrate the accuracy of our approach through differential analysis of lung fibroblasts in response to loss of the developmental transcription factor HOXA1, which we show is required for lung fibroblast and HeLa cell cycle progression. Loss of HOXA1 ...
Journal of Clinical Oncology
C. Capalbo, A. Buffone, A. Vestri, E. Ricevuto, C. Rinaldi, M. Zani, S. Ferraro, L. Frati, I. Screpanti, A. Gulino, G. Giannini
BRCAPRO is a statistical model, with associated software, for assessing the probability that an individual carries a germline deleterious mutation of the BRCA1 and BRCA2 genes, based on his or her family's history of breast and ovarian cancer, including male breast cancer and bilateral synchronous and asynchronous diagnoses. BRCAPRO uses a Mendelian approach that assumes autosomal dominant inheritance. This assumption is supported extensively by previous linkage analyses. Age-dependent penetrances and prevalences are based on a systematic review of the literature. BRCAPRO was originally developed as part of the Duke SPORE in ...
The Journal of Neuroscience
L. Q. Uddin, K. S. Supekar, S. Ryali, V. Menon
Using functional and effective connectivity measures applied to fMRI data, we examine interactions within and between the SN, CEN, and DMN. We find that functional coupling between key network nodes is stronger in adults than in children, as are causal links emanating from the rFIC. Specifically, the causal influence of the rFIC on nodes of the SN and CEN was significantly greater in adults compared with children. Notably, these results were entirely replicated on an independent dataset of matched children and adults. Developmental changes in functional and effective connectivity were related to structural connectivity along these links. Diffusion tensor imaging ...
Nicholas G. Reich, Jessica A. Myers, Daniel Obeng, Aaron M. Milstone, Trish M. Perl, Sten H. Vermund
In recent years, the number of studies using a cluster-randomized design has grown dramatically. In addition, the cluster-randomized crossover design has been touted as a methodological advance that can increase efficiency of cluster-randomized studies in certain situations. While the cluster-randomized crossover trial has become a popular tool, standards of design, analysis, reporting and implementation have not been established for this emergent design. We address one particular aspect of cluster-randomized and cluster-randomized crossover trial design: estimating statistical power. We present a general framework for estimating power via simulation in cluster-randomized studies with or without one or more crossover periods. We have ...
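The idea of estimating power via simulation can be illustrated with a small sketch. This is not the authors' framework or software: it assumes a simple parallel (no crossover) design, normally distributed outcomes, and a cluster-level z-test, and every function and parameter name below is hypothetical.

```python
import random
import statistics
from math import sqrt


def simulate_trial(rng, n_clusters_per_arm, cluster_size, effect,
                   sd_cluster, sd_indiv):
    """Simulate one trial; return cluster means for (control, treated)."""
    arms = []
    for arm_effect in (0.0, effect):
        means = []
        for _ in range(n_clusters_per_arm):
            # Shared random shift induces within-cluster correlation.
            shift = rng.gauss(0.0, sd_cluster)
            outcomes = [arm_effect + shift + rng.gauss(0.0, sd_indiv)
                        for _ in range(cluster_size)]
            means.append(statistics.fmean(outcomes))
        arms.append(means)
    return arms[0], arms[1]


def rejects_null(control, treated, critical=1.96):
    """Two-sided z-test on cluster means (normal approximation)."""
    se = sqrt(statistics.variance(control) / len(control)
              + statistics.variance(treated) / len(treated))
    z = (statistics.fmean(treated) - statistics.fmean(control)) / se
    return abs(z) > critical


def estimate_power(n_sims, seed=0, **design):
    """Fraction of simulated trials in which the null is rejected."""
    rng = random.Random(seed)
    hits = sum(rejects_null(*simulate_trial(rng, **design))
               for _ in range(n_sims))
    return hits / n_sims


design = dict(n_clusters_per_arm=10, cluster_size=20,
              sd_cluster=0.3, sd_indiv=1.0)
power_large_effect = estimate_power(200, effect=2.0, **design)
rejection_rate_null = estimate_power(200, effect=0.0, **design)
print(power_large_effect, rejection_rate_null)
```

Analysing cluster means rather than individual outcomes keeps the test honest about the clustering. Extending the sketch to crossover periods would require simulating each cluster under both conditions, which is deliberately left out here.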
Socio-Economic Planning Sciences
Paul W. Wilson
A software package for computing non-parametric efficiency estimates, making inference, and testing hypotheses in frontier models. Commands are provided for bootstrapping as well as computation of some new, robust estimators of efficiency, etc.
FEAR consists of a software library that can be linked to the general-purpose statistical package R. The routines included in FEAR allow the user to compute DEA estimates of technical, allocative, and overall efficiency while assuming either variable, non-increasing, or constant returns to scale. The routines are highly flexible, allowing measurement of efficiency of one group of observations relative to a technology defined by a second, reference ...
The Astrophysical Journal
We describe Ganalyzer, a model-based tool that can automatically analyze and classify galaxy images. Ganalyzer works by separating the galaxy pixels from the background pixels, finding the center and radius of the galaxy, generating the radial intensity plot, and then computing the slopes of the peaks detected in the radial intensity plot to measure the spirality of the galaxy and determine its morphological class. Unlike algorithms that are based on machine learning, Ganalyzer is based on measuring the spirality of the galaxy, a task that is difficult to perform manually, and in many cases can provide a more accurate analysis ...
Minoli A Perera, Larisa H Cavallari, Nita A Limdi, Eric R Gamazon, Anuar Konkashbaev, Roxana Daneshjou, Anna Pluzhnikov, Dana C Crawford, Jelai Wang, Nianjun Liu, Nicholas Tatonetti, Stephane Bourgeois, Harumi Takahashi, Yukiko Bradford, Benjamin M Burkley, Robert J Desnick, Jonathan L Halperin, Sherief I Khalifa, Taimour Y Langaee, Steven A Lubitz, Edith A Nutescu, Matthew Oetjens, Mohamed H Shahin, Shitalben R Patel, Hersh Sagreiya, Matthew Tector, Karen E Weck, Mark J Rieder, Stuart A Scott, Alan HB Wu, James K Burmester, Mia Wadelius, Panos Deloukas, Michael J Wagner, Taisei Mushiroda, Michiaki Kubo, Dan M Roden, Nancy J Cox, Russ B ...
VKORC1 and CYP2C9 are important contributors to warfarin dose variability, but explain less variability for individuals of African descent than for those of European or Asian descent. We aimed to identify additional variants contributing to warfarin dose requirements in African Americans. We did a genome-wide association study of discovery and replication cohorts. Samples from African-American adults (aged ≥18 years) who were taking a stable maintenance dose of warfarin were obtained at International Warfarin Pharmacogenetics Consortium (IWPC) sites and the University of Alabama at Birmingham (Birmingham, AL, USA). Patients enrolled at IWPC sites but who were not used for discovery made ...
A. Roberts, H. Pimentel, C. Trapnell, L. Pachter
We describe a new “reference annotation based transcript assembly” problem for RNA-Seq data that involves assembling novel transcripts in the context of an existing annotation. This problem arises in the analysis of expression in model organisms, where it is desirable to leverage existing annotations for discovering novel transcripts. We present an algorithm for reference annotation based transcript assembly and show how it can be used to rapidly investigate novel transcripts revealed by RNA-Seq in comparison with a reference annotation.
Supplementary Information: The assemblies compared in the Results section are provided along with the publication at the journal’s website.
Jianxing Feng, Tao Liu, Bo Qin, Yong Zhang, Xiaole Shirley Liu
Model-based analysis of ChIP-seq (MACS) is a computational algorithm that identifies genome-wide locations of transcription/chromatin factor binding or histone modification from ChIP-seq data. MACS consists of four steps: removing redundant reads, adjusting read position, calculating peak enrichment and estimating the empirical false discovery rate (FDR). In this protocol, we provide a detailed demonstration of how to install MACS and how to use it to analyze three common types of ChIP-seq data sets with different characteristics: the sequence-specific transcription factor FoxA1, the histone modification mark H3K4me3 with sharp enrichment and the H3K36me3 mark with broad enrichment. We also explain how to ...
Adam Roberts, Cole Trapnell, Julie Donaghey, John L Rinn, Lior Pachter
The biochemistry of RNA-Seq library preparation results in cDNA fragments that are not uniformly distributed within the transcripts they represent. This non-uniformity must be accounted for when estimating expression levels, and we show how to perform the needed corrections using a likelihood based approach. We find improvements in expression estimates as measured by correlation with independently performed qRT-PCR and show that correction of bias leads to improved replicability of results across libraries and sequencing technologies.
Mark Dingemanse, Francisco Torreira, N. J. Enfield, Johan J. Bolhuis
A word like Huh? (used as a repair initiator when, for example, one has not clearly heard what someone just said) is found in roughly the same form and function in spoken languages across the globe. We investigate it in naturally occurring conversations in ten languages and present evidence and arguments for two distinct claims: that Huh? is universal, and that it is a word. In support of the first, we show that the similarities in form and function of this interjection across languages are much greater than expected by chance. In support of the second claim we show that it ...
Conference Proceedings. 2nd International IEEE EMBS Conference on Neural Engineering, 2005.
D. Wagenaar, T.B. DeMarse, S.M. Potter
We present a software suite, MeaBench, for data acquisition and online analysis of multi-electrode recordings, especially from micro-electrode arrays. Besides controlling data acquisition hardware, MeaBench includes algorithms for real-time stimulation artifact suppression and spike detection, as well as programs for online display of voltage traces from 60 electrodes and continuously updated spike raster plots. MeaBench features real-time output streaming, allowing easy integration with stimulator systems. We have been able to generate stimulation sequences in response to live neuronal activity with less than 20 ms lag time. MeaBench is open-source software, and is available for free public download at http://www.its.caltech.edu/~pinelab/wagenaar/meabench.html
Journal of Clinical Oncology
C. C. Jaffe
RECIST (Response Evaluation Criteria in Solid Tumors) is a widely employed method introduced in 2000 to assess change in tumor size in response to therapy. The simplicity of the technique, however, contrasts sharply with the increasing sophistication of imaging instrumentation. Anatomically based imaging measurement, although supportive of drug development and key to some accelerated drug approvals, is being pressed to improve its methodologic robustness, particularly in the light of more functionally based imaging that is sensitive to tissue molecular response, such as fluorodeoxyglucose positron emission tomography. Nevertheless, ready availability of computed tomography and magnetic resonance imaging machines largely assures anatomically based ...
Stefano Allesina, Erik von Elm
Nepotistic practices are detrimental for academia. Here I show how disciplines with a high likelihood of nepotism can be detected using standard statistical techniques based on shared last names among professors. As an example, I analyze the set of all 61,340 Italian academics. I find that nepotism is prominent in Italy, with particular disciplinary sectors being detected as especially problematic. Out of 28 disciplines, 9 – accounting for more than half of Italian professors – display a significant paucity of last names. Moreover, in most disciplines a clear north-south trend emerges, with likelihood of nepotism increasing with latitude. Even accounting ...
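A "paucity of last names" can be tested with a simple resampling argument: if a discipline of a given size shows fewer distinct surnames than random draws of the same size from the whole population usually do, shared names are over-represented. The sketch below is one way such a test could look, not necessarily the procedure used in the paper. The function name, the toy surname data, and the number of draws are all assumptions made for illustration.

```python
import random


def paucity_p_value(population, discipline, n_draws=2000, seed=0):
    """Share of random draws with at most as many distinct surnames
    as observed in the discipline (small value = paucity of names)."""
    rng = random.Random(seed)
    observed = len(set(discipline))
    size = len(discipline)
    as_few = sum(
        len(set(rng.sample(population, size))) <= observed
        for _ in range(n_draws)
    )
    return as_few / n_draws


# Toy population: 1000 academics, 500 surnames, each carried by two people.
population = [f"surname{i % 500}" for i in range(1000)]

# Hypothetical discipline of 20 people sharing only 2 surnames.
clustered = ["surname0"] * 10 + ["surname1"] * 10
# Hypothetical discipline of 20 people with 20 different surnames.
diverse = [f"surname{i}" for i in range(20)]

p_clustered = paucity_p_value(population, clustered)
p_diverse = paucity_p_value(population, diverse)
print(p_clustered, p_diverse)
```

In this toy setting the clustered discipline gets a p-value near zero, while the fully diverse one cannot show a paucity at all. Real surname data would also need the regional variation in surname frequency that the abstract alludes to.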
C. Huang, P. Mattis, K. Perrine, N. Brown, V. Dhawan, D. Eidelberg
We used FDG PET to measure regional glucose metabolism in patients with PD with multiple-domain MCI (MD-MCI; n = 18), with single-domain MCI (SD-MCI; n = 15), and without MCI (N-MCI; n = 18). These patients were matched for age, education, disease duration, and motor disability. Maps of regional metabolism in the three groups were compared using statistical parametric mapping (SPM). We also computed the expression of a previously validated cognition-related spatial covariance pattern (PDCP) in the patient groups and in an age-matched healthy control cohort (n = 15). PDCP expression was compared across groups using analysis of variance.
Monthly Notices of the Royal Astronomical Society
A revision of Stodółkiewicz's Monte Carlo code is used to simulate evolution of large star clusters. The new method treats each superstar as a single star and follows the evolution and motion of all individual stellar objects. A survey of the evolution of N-body systems influenced by the tidal field of a parent galaxy and by stellar evolution is presented. The process of energy generation is realized by means of appropriately modified versions of Spitzer's and Mikkola's formulae for the interaction cross-section between binaries and field stars and binaries themselves. The results presented are in good agreement with theoretical expectations ...