Table of Contents

About the Editor
Big Data across the Federal Government
Big Data at the National Science Foundation
Big Data Interagency Working Group (IWG)
Big Data to Knowledge at NIH
The Role of Big Data, Machine Learning, and AI in Assessing Risks: a Regulatory Perspective
FTC Report Provides Recommendations to Businesses
Responder News: Big Data and Public Safety
Has Big Data Made Us Lazy?
The Real Challenge of Big Data
BioSense 2.0 is the first system to take into account the feasibility of regional and national coordination for public health situation awareness through an interoperable network of systems built on existing state and local capabilities. BioSense 2.0 removes many of the costs associated with monolithic physical architecture, while still making the distributed aspects of the system transparent to end users, as well as making data accessible for appropriate analyses and reporting.

Networked phylogenomics for bacteria and outbreak ID: CDC's Special Bacteriology Reference Laboratory (SBRL) identifies and classifies unknown bacterial pathogens for rapid outbreak detection. Phylogenomics, the comparative phylogenetic analysis of the entire genome DNA sequence, will bring the concept of sequence-based identification to an entirely new level, with implications for public health. New species identification will allow for multiple analyses on a rapidly emerging pathogen to be performed in hours, rather than days or weeks.

Centers for Medicare & Medicaid Services (CMS)

A data warehouse based on Hadoop is being developed to support analytic and reporting requirements from Medicare and Medicaid programs. A major goal is to develop a supportable, sustainable and scalable design that accommodates accumulated data at the warehouse level. Also challenging is developing a solution that complements existing technologies.

The use of XML database technologies is being evaluated to support the transaction-intensive environment of the Insurance Exchanges, specifically to support the eligibility and enrollment processes. XML databases can potentially accommodate Big Tables scale data while being optimized for transactional performance.

Using administrative claims data (Medicare) to improve decision-making: CMS has a current set of pilot projects with the Oak Ridge National Laboratory that involve the evaluation of data visualization tools, platform technologies, user interface options and high performance computing technologies, aimed at using administrative claims data (Medicare) to create information products to guide and support improved decision-making in various CMS high priority programs.

FOOD AND DRUG ADMINISTRATION (FDA)

A Virtual Laboratory Environment (VLE) will combine existing resources and capabilities to enable a virtual laboratory data network, advanced analytical and statistical tools and capabilities, crowd sourcing of analytics to predict and promote public health, document management support, tele-presence capability to enable worldwide collaboration, and basically make any location a virtual laboratory with advanced capabilities in a matter of hours.

NATIONAL ARCHIVES & RECORDS ADMINISTRATION (NARA)

The Cyberinfrastructure for a Billion Electronic Records (CI-BER) is a joint agency sponsored testbed notable for its application of a multi-agency sponsored cyberinfrastructure and the National Archives' collection of digital records and information now active at the Renaissance Computing Institute. This testbed will evaluate technologies and approaches to support sustainable access to ultra-large data collections.

NATIONAL AERONAUTICS & SPACE ADMINISTRATION (NASA)

NASA's Advanced Information Systems Technology (AIST) awards seek to reduce the risk and cost of evolving NASA information systems to support future Earth observation missions and to transform observations into Earth information as envisioned by NASA's Climate Centric Architecture. Some AIST programs seek to mature Big Data capabilities to reduce the risk, cost, size and development time of Earth Science Division space-based and ground-based information systems and increase the accessibility and utility of science data.

NASA's Earth Science Data and Information System (ESDIS) project, active for over 15 years, has worked to process, archive, and distribute Earth science satellite data and data from airborne and field campaigns.
With attention to user satisfaction, it strives to ensure that scientists and the public have access to data to enable the study of Earth from space to advance Earth system science to meet the challenges of climate and environmental change.

The Global Earth Observation System of Systems (GEOSS) is a collaborative, international effort to share and integrate Earth observation data. NASA has joined forces with the U.S. Environmental Protection Agency (EPA), National Oceanic and Atmospheric Administration (NOAA), other agencies and nations to integrate satellite and ground-based monitoring and modeling systems to evaluate environmental conditions and predict outcomes of events such as forest fires, population growth and other developments that are natural and man-made. In the near-term, with academia, researchers will integrate a complex variety of air quality information to better understand and address the impact of air quality on the environment and human health.

A Space Act Agreement, entered into by NASA and Cray, Inc., allows for collaboration on one or more projects centered on the development and application of low-latency "big data" systems. In particular, the project is testing the utility of hybrid computer systems using a highly integrated non-SQL database as a means for data delivery to accelerate the execution of modeling and analysis software.
NASA's Planetary Data System (PDS) is an archive of data products from NASA planetary missions, which has become a basic resource for scientists around the world. All PDS-produced products are peer-reviewed, well-documented, and easily accessible via a system of online catalogs that are organized by planetary disciplines.

The Multimission Archive at the Space Telescope Science Institute (MAST), a component of NASA's distributed Space Science Data Services, supports and provides to the astronomical community a variety of astronomical data archives, with the primary focus on scientifically related data sets in the optical, ultraviolet, and near-infrared parts of the spectrum. MAST archives and provides access to a variety of spectral and image data.

The Earth System Grid Federation is a public archive expected to support the research underlying the Intergovernmental Panel on Climate Change's Fifth Assessment Report, to be completed in 2014 (as it did for the Fourth Assessment Report).
NASA is contributing both observational data and model output to the Federation through collaboration with the DOE.

NATIONAL INSTITUTES OF HEALTH (NIH)

National Cancer Institute (NCI)

The Cancer Imaging Archive (TCIA) is an image data-sharing service that facilitates open science in the field of medical imaging. TCIA aims to improve the use of imaging in today's cancer research and practice by increasing the efficiency and reproducibility of imaging cancer detection and diagnosis, leveraging imaging to provide an objective assessment of therapeutic response, and ultimately enabling the development of imaging resources that will lead to improved clinical decision support.

The Cancer Genome Atlas (TCGA) project is a comprehensive and coordinated effort to accelerate understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. With fast development of large-scale genomic technology, the TCGA project will accumulate several petabytes of raw data.

National Heart Lung and Blood Institute (NHLBI)

The Cardiovascular Research Grid (CVRG) and the Integrating Data for Analysis, Anonymization and Sharing (iDASH) are two informatics resources supported by NHLBI which provide secure data storage, integration, and analysis resources that enable collaboration while minimizing the burden on users. The CVRG provides resources for the cardiovascular research community to share data and analysis tools. iDASH leads development in privacy-preserving technology and is fostering an integrated data sharing and analysis environment.
National Institute of Biomedical Imaging and Bioengineering (NIBIB)

The Development and Launch of an Interoperable and Curated Nanomaterial Registry, led by the NIBIB institute, seeks to establish a nanomaterial registry, whose primary function is to provide consistent and curated information on the biological and environmental interactions of well-characterized nanomaterials, as well as links to associated publications, modeling tools, computational results and manufacturing guidances. The registry facilitates building standards and consistent information on manufacturing and characterizing nanomaterials, as well as their biological interactions.

The Internet Based Network for Patient-Controlled Medical Image Sharing contract addresses the feasibility of an image sharing model to test how hospitals, imaging centers and physician practices can implement cross-enterprise document sharing to transmit images and image reports.

As a Research Resource for Complex Physiologic Signals, PhysioNet offers free web access to large collections of recorded physiologic signals (PhysioBank) and related open-source software (PhysioToolkit). Each month, about 45,000 visitors worldwide use PhysioNet, retrieving about 4 [...] of data.

The Neuroimaging Informatics Tools and Resource Clearinghouse (NITRC) is a NIH blueprint project to promote the dissemination, sharing, adoption and evolution of neuroinformatics tools and neuroimaging data by providing access, information and forums for interaction for the research community. Over 450 software tools and data sets are registered on NITRC; the site has had over 30.1 million hits since its launch in 2007.

The Extensible Neuroimaging Archive Toolkit (XNAT) is an open source imaging informatics platform, developed by the Neuroinformatics Research Group at Washington University, and widely used by research institutions around the world.
XNAT facilitates common management, productivity and quality assurance tasks for imaging and associated data.

The Computational Anatomy and Multidimensional Modeling Resource: The Los Angeles Laboratory of Neuro Imaging (LONI) houses databases that contain imaging data from several modalities, mostly various forms of MR and PET, genetics, and other data. The Alzheimer's Disease Neuroimaging Initiative (ADNI) is a good example of a project that collects data from acquisition sites around the U.S., makes data anonymous, quarantines it until quality control is done (often immediately) and then makes it available for download to users around the world in a variety of formats.

The Computer-Assisted Functional Neurosurgery Database develops methods and techniques to facilitate the placement and programming of Deep Brain Stimulators (DBSs) used for the treatment of Parkinson's disease and other movement disorders. A central database has been developed at Vanderbilt University (VU), which is collaborating with Ohio State and Wake Forest universities to acquire data from multiple sites. Since the clinical workflow and the stereotactic frames at different hospitals can vary, the surgical planning software has been updated and successfully tested.

The NIH Biomedical Information Science and Technology Initiative (BISTI) Consortium has for over a decade joined the institutes and centers at NIH to promote the nation's research in Biomedical Informatics and Computational Biology (BICB), promoted a number of program announcements and funded more than a billion dollars in research. In addition, the collaboration has promoted activities within NIH such as the adoption of modern data and software sharing practices so that the fruits of research are properly disseminated to the research community.

The NIH Blueprint Neuroscience Information Framework (NIF) is a dynamic inventory of Web-based neuroscience resources: data, materials, and tools accessible via any computer connected to the Internet. An initiative of the NIH Blueprint for Neuroscience Research, NIF advances neuroscience research by enabling discovery and access to public research data and tools worldwide through an open source, networked environment.

The NIH Human Connectome Project is an ambitious effort to map the neural pathways that underlie human brain function and to share data about the structural and functional connectivity of the human brain. The project will lead to major advances in our understanding of what makes us uniquely human and will set the stage for future studies of abnormal brain circuits in many neurological and psychiatric disorders.

NIH Common Fund

The National Centers for Biomedical Computing (NCBC) are intended to be part of the national infrastructure in Biomedical Informatics and Computational Biology. The eight centers create innovative software programs and other tools that enable the biomedical community to integrate, analyze, model, simulate, and share data on human health and disease.

The Patient Reported Outcomes Measurement Information System (PROMIS) is a system of highly reliable, valid, flexible, precise, and responsive assessment tools that measure patient-reported health status. A core resource is the Assessment Center, which provides tools and a database to help researchers collect, store, and analyze data related to patient health status.

National Institute of General Medical Sciences
The Models of Infectious Disease Agent Study (MIDAS) is an effort to develop computational and analytical approaches for integrating infectious disease information rapidly and providing modeling results to policy makers at the local, state, national, and global levels. While data need to be collected and integrated globally, because public health policies are implemented locally, information must also be fine-grained, with needs for data access, management, analysis and archiving.

The Structural Genomics Initiative advances the discovery, analysis and dissemination of three-dimensional structures of protein, RNA and other biological macromolecules representing the entire range of structural diversity found in nature to facilitate fundamental understanding and applications in biology, agriculture and medicine. Worldwide efforts include the NIH funded Protein Structure Initiative, Structural Genomics Centers for Infectious Diseases, Structural Genomics Consortium in Stockholm and the RIKEN Systems and Structural Biology Center in Japan.
These efforts coordinate their sequence target selection through a central database, TargetDB, hosted at the Structural Biology Knowledgebase.

The Worldwide Protein Data Bank (wwPDB), a repository for the collection, archiving and free distribution of high quality macromolecular structural data to the scientific community on a timely basis, represents the preeminent source of experimentally determined macromolecular structure information for research and teaching in biology, biological chemistry, and medicine. The U.S. component of the project (RCSB PDB) is jointly funded by five Institutes of NIH, DOE/BER and NSF, as well as participants in the UK and Japan. The single databank now contains experimental structures and related annotation for 80,000 macromolecular structures. The Web site receives 211,000 unique visitors per month from 140 different countries. Around 1 terabyte of data is transferred each month from the website.

The Biomedical Informatics Research Network (BIRN), a national initiative to advance biomedical research through data sharing and collaboration, provides a user-driven, software-based framework for research teams to share significant quantities of data rapidly, securely and privately across geographic distance and/or incompatible computing systems, serving diverse research communities.

National Library of Medicine

Informatics for Integrating Biology and the Bedside (i2b2) seeks the creation of tools and approaches that facilitate integration and exchange of the informational by-products of healthcare and biomedical research. Software tools for integrating data that were developed by i2b2 are used at more than 50 organizations worldwide through open source sharing and under open source governance.
Office of Behavioral and Social Sciences Research (OBSSR)

The National Archive of Computerized Data on Aging (NACDA) program advances research on aging by helping researchers to profit from the under-exploited potential of a broad range of datasets. NACDA preserves and makes available the largest library of electronic data on aging in the United States.

Data Sharing for Demographic Research (DSDR) provides data archiving, preservation, dissemination and other data infrastructure services. DSDR works toward a unified legal, technical and substantive framework in which to share research data in the population sciences.

A Joint NIH-NSF Program

The Collaborative Research in Computational Neuroscience (CRCNS) is a joint NIH-NSF program to support collaborative research projects between computational scientists and neuroscientists that will advance the understanding of nervous system structure and function, mechanisms underlying nervous system disorders and computational strategies used by the nervous system. In recent years, the German Federal Ministry of Education and Research has also joined the program and supported research in Germany.

NATIONAL SCIENCE FOUNDATION (NSF)

Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA) is a new joint solicitation between NSF and NIH that aims to advance the core scientific and technological means of managing, analyzing, visualizing and extracting useful information from large, diverse, distributed and heterogeneous data sets. Specifically, it will support the development and evaluation of technologies and tools for data collection and management, data analytics, and/or e-science collaborations, which will enable breakthrough discoveries and innovation in science, engineering, and medicine, laying the foundations for U.S. competitiveness for many decades to come.

Cyberinfrastructure Framework for 21st Century Science and Engineering (CIF21) develops, consolidates, coordinates, and leverages a set of advanced cyberinfrastructure programs and efforts across NSF to create meaningful cyberinfrastructure, as well as develop a level of integration and interoperability of data and tools to support science and education.

CIF21 Track for IGERT: NSF has shared with its community plans to establish a new CIF21 track as part of its Integrative Graduate Education and Research Traineeship (IGERT) program. This track aims to educate and support a new generation of researchers able to address
About the Editor

Michael Erbschloe has worked for over 30 years performing analysis of the economics of information technology, public policy relating to technology, and utilizing technology in reengineering organization processes. He has authored several books on social and management issues of information technology that were published by McGraw Hill and other major publishers. He has also taught at several universities and developed technology-related curriculum. His career has focused on several interrelated areas:

Technology strategy, analysis, and forecasting
Teaching and curriculum development
Writing books and articles
Publishing and editing
Public policy analysis and program evaluation

Books by Michael Erbschloe

Threat Level Red: Cybersecurity Research Programs of the U.S. Government (CRC Press)
Social Media Warfare: Equal Weapons for All (Auerbach Publications)
Walling Out the Insiders: Controlling Access to Improve Organizational Security (Auerbach Publications)
Physical Security for IT (Elsevier Science)
Trojans, Worms, and Spyware (Butterworth-Heinemann)
Implementing Homeland Security in Enterprise IT (Digital Press)
Guide to Disaster Recovery (Course Technology)
Socially Responsible IT Management (Digital Press)
Information Warfare: How to Survive Cyber Attacks (McGraw Hill)
The Executive's Guide to Privacy Management (McGraw Hill)
Net Privacy: A Guide to Developing & Implementing an e-business Privacy Plan (McGraw Hill)
Introduction

In March 2012, the Obama Administration announced the "Big Data Research and Development Initiative." By improving our ability to extract knowledge and insights from large and complex collections of digital data, the initiative promises to help accelerate the pace of discovery in science and engineering, strengthen our national security, and transform teaching and learning.

To launch the initiative, six Federal departments and agencies announced more than $200 million in new commitments that, together, promise to greatly improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data.

Some companies are already sponsoring Big Data-related competitions and providing funding for university research. Universities are beginning to create new courses, and entire courses of study, to prepare the next generation of "data scientists." Organizations like Data Without Borders are helping non-profits by providing pro bono data collection, analysis, and visualization. OSTP would be very interested in supporting the creation of a forum to highlight new public-private partnerships related to Big Data.
Big Data across the Federal Government

March 29, 2012

Below are highlights of ongoing Federal government programs that address the challenges of, and tap the opportunities afforded by, the big data revolution to advance agency missions and further scientific discovery and innovation.

DEPARTMENT OF DEFENSE (DOD)

Data to Decisions: The Department of Defense (DOD) is "placing a big bet on big data," investing $250 million annually (with $60 million available for new research projects) across the Military Departments in a series of programs that will:

Harness and utilize massive data in new ways and bring together sensing, perception and decision support to make truly autonomous systems that can maneuver and make decisions on their own.

Improve situational awareness to help warfighters and analysts and provide increased support to operations. The Department is seeking a 100-fold increase in the ability of analysts to extract information from texts in any language, and a similar increase in the number of objects, activities, and events that an analyst can observe.

To accelerate innovation in Big Data that meets these and other requirements, DOD will announce a series of open prize competitions over the next several months.

DEFENSE ADVANCED RESEARCH PROJECTS AGENCY (DARPA)

The Anomaly Detection at Multiple Scales (ADAMS) program addresses the problem of anomaly detection and characterization in massive data sets. In this context, anomalies in data are intended to cue collection of additional, actionable information in a wide variety of real-world contexts.
The initial ADAMS application domain is insider-threat detection, in which anomalous actions by an individual are detected against a background of routine network activity.

The Cyber-Insider Threat (CINDER) program seeks to develop novel approaches to detect activities consistent with cyber espionage in military computer networks. As a means to expose hidden operations, CINDER will apply various models of adversary missions to "normal" activity on internal networks. CINDER also aims to increase the accuracy, rate and speed with which cyber threats are detected.

The Insight program addresses key shortfalls in current intelligence, surveillance and reconnaissance systems. Automation and integrated human-machine reasoning enable operators to analyze greater numbers of potential threats ahead of time-sensitive situations. The Insight program aims to develop a resource-management system to automatically identify threat networks and irregular warfare operations through the analysis of information from imaging and non-imaging sensors and other sources.

The Machine Reading program seeks to realize artificial intelligence applications by developing learning systems that process natural text and insert the resulting semantic representation into a knowledge base, rather than relying on expensive and time-consuming current processes for knowledge representation that require experts and associated knowledge engineers to hand-craft information.

The Mind's Eye program seeks to develop a capability for "visual intelligence" in machines. Whereas traditional study of machine vision has made progress in recognizing a wide range of objects and their properties (what might be thought of as the nouns in the description of a scene), Mind's Eye seeks to add the perceptual and cognitive underpinnings needed for recognizing and reasoning about the verbs in those scenes.
Together, these technologies could enable a more complete visual narrative.

The Mission-oriented Resilient Clouds program aims to address security challenges inherent in cloud computing by developing technologies to detect, diagnose and respond to attacks, effectively building a "community health system" for the cloud. The program also aims to develop technologies to enable cloud applications and infrastructure to continue functioning while under attack. The loss of individual hosts and tasks within the cloud ensemble would be allowable as long as overall mission effectiveness was preserved.

The Programming Computation on Encrypted Data (PROCEED) research effort seeks to overcome a major challenge for information security in cloud-computing environments by developing practical methods and associated modern programming languages for computation on data that remains encrypted the entire time it is in use. Because the data would be manipulated without first being decrypted, adversaries would have a more difficult time intercepting it.

The Video and Image Retrieval and Analysis Tool (VIRAT) program aims to develop a system to provide military imagery analysts with the capability to exploit the vast amount of overhead video content being collected. If successful, VIRAT will enable analysts to establish alerts for activities and events of interest as they occur. VIRAT also seeks to develop tools that would enable analysts to rapidly retrieve, with high precision and recall, video content from extremely large video libraries.
The XDATA program seeks to develop computational techniques and software tools for analyzing large volumes of semi-structured and unstructured data. Central challenges to be addressed include scalable algorithms for processing imperfect data in distributed data stores and effective human-computer interaction tools that are rapidly customizable to facilitate visual reasoning for diverse missions. The program envisions open source software toolkits for flexible software development that enable processing of large volumes of data for use in targeted defense applications.

DEPARTMENT OF HOMELAND SECURITY (DHS)

The Center of Excellence on Visualization and Data Analytics (CVADA), a collaboration among researchers at Rutgers University and Purdue University (with three additional partner universities each), leads research efforts on large, heterogeneous data that First Responders could use to address issues ranging from manmade or natural disasters to terrorist incidents; law enforcement to border security concerns; and explosives to cyber threats.

DEPARTMENT OF ENERGY (DOE)

The Office of Advanced Scientific Computing Research (ASCR) provides leadership to the data management, visualization and data analytics communities, including digital preservation and community access. Programs within the suite include widely used data management technologies such as the Kepler scientific workflow system; the Storage Resource Management standard; a variety of data storage management technologies, such as BeStMan, the Bulk Data Mover and the Adaptable IO System (ADIOS); FastBit data indexing technology (used by Yahoo!); and two major scientific visualization tools, ParaView and VisIt.

The High Performance Storage System (HPSS) is software that manages petabytes of data on disks and robotic tape systems. Developed by DOE and IBM with input from universities and labs around the world, HPSS is used by digital libraries, defense applications and a range of scientific disciplines including nanotechnology, genomics, chemistry, magnetic resonance imaging, nuclear physics, computational fluid dynamics, climate science, etc., as well as Northrop Grumman, NASA and the Library of Congress.

Mathematics for Analysis of Petascale Data addresses the mathematical challenges of extracting insights from huge scientific datasets, finding key features and understanding the relationships between those features. Research areas include machine learning, real-time analysis of streaming data, stochastic nonlinear data-reduction techniques and scalable statistical analysis techniques applicable to a broad range of DOE applications including sensor data from the [...]
The Next Generation Networking program supports tools that enable research collaborations to find, move and use large data: from the Globus Middleware Project in 2001, to the GridFTP data transfer protocol in 2003, to the Earth Systems Grid (ESG) in 2007. Today, GridFTP servers move over 1 petabyte of science data per month for the Open Science Grid, ESG, and Biology communities. Globus middleware has also been leveraged by a collaboration of Texas universities, software companies, and oil companies to train students in state-of-the-art petroleum engineering methods and integrated workflows.

Office of Basic Energy Sciences (BES)

BES Scientific User Facilities have supported a number of efforts aimed at assisting users with data management and analysis of big data, which can be as big as terabytes (10^12 bytes) of data per day from a single experiment. For example, the Accelerating Data Acquisition, Reduction and Analysis (ADARA) project addresses the data workflow needs of the Spallation Neutron Source, and the Coherent X-ray Imaging Data Bank has been created to maximize data availability and more efficient use of synchrotron light sources.

The Data and Communications in Basic Energy Sciences workshop in October 2011, sponsored by BES and ASCR, identified needs in experimental data that could impact the progress of scientific discovery.
The Biological and Environmental Research Program (BER), Atmospheric Radiation Measurement (ARM) Climate Research Facility is a multi-platform scientific user facility that provides the international research community infrastructure for obtaining precise observations of key atmospheric phenomena needed for the advancement of atmospheric process understanding and climate models. ARM data are available and used as a resource for over 100 journal articles per year. Challenges associated with collecting and presenting the high temporal resolution and spectral information from hundreds of instruments are being addressed to meet user needs.

The Systems Biology Knowledgebase (Kbase) is a community-driven software framework enabling data-driven predictions of microbial, plant and biological community function in an environmental context. Kbase was developed with an open design to improve algorithmic development and deployment efficiency, and for access to and integration of experimental data from heterogeneous sources. Kbase is not a typical database but a means to interpret missing information to become a predictive tool for experimental design.

The Office of Fusion Energy Sciences (FES)
The Scientific Discovery through Advanced Computing (SciDAC) partnership between FES and the Office of Advanced Scientific Computing Research (ASCR) addresses big data challenges associated with computational and experimental research in fusion energy science. The data management technologies developed by the ASCR-FES partnerships include high performance input/output systems, advanced scientific workflow and provenance frameworks, and visualization techniques addressing unique fusion needs, which have attracted the attention of European integrated modeling efforts and ITER, an international nuclear fusion research and engineering project.

The Office of High Energy Physics (HEP)

The Computational High Energy Physics Program supports research for the analysis of large, complex experimental data sets as well as large volumes of simulated data, an undertaking that typically requires a global effort by hundreds of scientists.
Collaborative big data management ventures include the PanDA (Production and Distributed Analysis) Workload Management System and XRootD, a high performance, fault tolerant software for fast, scalable access to data repositories of many kinds.

The Office of Nuclear Physics (NP)

The U.S. Nuclear Data Program (USNDP) is a multisite effort involving seven national labs and two universities that maintains and provides access to extensive, dedicated databases spanning several areas of nuclear physics, which compile and cross-check all relevant experimental results on important properties of nuclei.

The Office of Scientific and Technical Information (OSTI)

OSTI, the only U.S. federal agency member of DataCite (a global consortium of leading scientific and technical information organizations), plays a key role in shaping the policies and technical implementations of the practice of data citation, which enables efficient reuse and verification of data so that the impact of data may be tracked, and a scholarly structure that recognizes and rewards data producers may be established.

DEPARTMENT OF VETERANS AFFAIRS (VA)
The Consortium for Healthcare Informatics Research (CHIR) develops Natural Language Processing (NLP) tools in order to unlock vast amounts of information that are currently stored in VA as text data.

ProWatch: Efforts in the VA are underway to produce transparent, reproducible and reusable software for surveillance of various safety related events. ProWatch is a research-based surveillance program that relies on newly developed informatics resources to detect, track, and measure health conditions associated with military deployment.

AViVA is the VA's next generation employment human resources system that will separate the database from the business applications and from the browser-based user interface.
Analytical tools are already being built upon this foundation for research and, ultimately, support of decisions at the patient encounter.

The Observational Medical Outcomes Project is designed to compare the validity, feasibility and performance of various safety surveillance analytic methods.

The Corporate Data Warehouse (CDW) is the VA program to organize and manage data from various sources with delivery to the point of care for a complete view of the disease and treatment for individuals and populations.

The Health Data Repository is standardizing terminology and data format among health care providers and notably between the VA and DOD, allowing the CDW to integrate data.

The Genomic Information System for Integrated Science (GenISIS) is a program to enhance health care for Veterans through personalized medicine. The GenISIS consortium serves as the contact for clinical studies with access to the electronic health records and genetic data in order that clinical trials, genomic trials and outcome studies can be conducted across the VA.

The Million Veteran Program is recruiting voluntary contribution of blood samples from veterans for genotyping [...] support the GenISIS consortium and will be attributed to the "phenotype" in the individual veteran's health record for understanding [...]

VA Informatics and Computing Infrastructure provides analytical workspace and tools for the analysis of large datasets now available in the VA, promoting collaborative research from anywhere on the VA network.

HEALTH AND HUMAN SERVICES (HHS)

Centers for Disease Control & Prevention (CDC)