Scientific investigation is highly collaborative, and discovery depends on the ability to share data seamlessly between institutions. However, effective data sharing can be challenging, especially for large data sets, and it is still not uncommon for researchers to ship disks rather than use the network because of network performance issues.
The ability to measure and interpret network behavior is critical to understanding data transfer performance. Information about the end-to-end data path makes it possible to identify and address potential problems.
The NetSage Measurement and Analysis Framework was developed specifically to understand how research and education (R&E) networks are used in practice and to evaluate data transfer performance. NetSage integrates multiple data sources to support objective observations of overall performance. NetSage deployments can collect data from routers or switches (such as SNMP or flow data), active testing sites (such as perfSONAR), and science data archives (using Tstat). A NetSage deployment combines passive and active measurements to provide longitudinal performance visualizations via dashboards. These dashboards can be viewed by resource collection, institution, or project, and they show data over time so that changes in data transfer behavior can be identified.
NetSage Dashboards can answer several different types of questions about usage. For example, a Heatmap from the Pacific Wave Portal Flow Data Dashboard uses flow data to measure data transfers to and from the Zoom video conferencing hosting site during the period when R&E institutional use of Zoom changed radically. In March 2020, when universities started to respond to COVID-19 pandemic-related restrictions, network resource owners needed to know how their systems were responding to the change in use. The Heatmap shows data volumes starting in February that increased on or around March 12, when many US universities declared that researchers could not travel. This was followed ten days later by a decrease, likely caused by a combination of institutions shifting to Spring Break, institutions issuing work-from-home directives (so the traffic shifted to home networks rather than R&E networks), and Zoom shifting some of its hosting to cloud services rather than its own IP space.
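To illustrate the idea behind such a heatmap, the following is a minimal sketch of how flow records might be binned into day-by-hour cells. It is not NetSage's implementation: the function name `heatmap_bins`, the record fields `start` and `bytes`, and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def heatmap_bins(flows):
    """Sum transferred bytes per (date, hour) cell.

    Each flow is assumed to be a dict with a hypothetical 'start' field
    (Unix timestamp in seconds, UTC) and a hypothetical 'bytes' field.
    """
    grid = defaultdict(int)
    for flow in flows:
        ts = datetime.fromtimestamp(flow["start"], tz=timezone.utc)
        grid[(ts.date().isoformat(), ts.hour)] += flow["bytes"]
    return dict(grid)


# Made-up sample records, not real measurements.
sample_flows = [
    {"start": 1583971200, "bytes": 4_000_000},  # 2020-03-12 00:00 UTC
    {"start": 1583973000, "bytes": 1_000_000},  # 2020-03-12 00:30 UTC
    {"start": 1584021600, "bytes": 2_500_000},  # 2020-03-12 14:00 UTC
]

cells = heatmap_bins(sample_flows)
for (day, hour), total in sorted(cells.items()):
    print(day, f"{hour:02d}:00", total)
```

Each cell total could then be mapped to a color, with days on one axis and hours on the other, so that a sustained rise or drop in volume stands out visually.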
The main use cases for the NetSage Framework have included: understanding the data movement patterns across a suite of resources; identifying the main sources and destinations for large data transfers, or flows; visualizing information about the research projects and science domains that are moving data; and displaying patterns of behavior for data movement between organizations.
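As one way to picture the second use case, the sketch below totals bytes per source and destination pair for flows above a size threshold and ranks the results. The function `top_pairs`, the field names `src_org`, `dst_org`, and `bytes`, the 1 GB threshold, and the organization names are assumptions made for this example, not part of NetSage.

```python
from collections import Counter


def top_pairs(flows, min_bytes=1_000_000_000, n=3):
    """Rank (source, destination) pairs by total bytes in large flows.

    Flows smaller than min_bytes are ignored. All field names here are
    hypothetical placeholders for whatever a flow record provides.
    """
    totals = Counter()
    for flow in flows:
        if flow["bytes"] >= min_bytes:
            totals[(flow["src_org"], flow["dst_org"])] += flow["bytes"]
    return totals.most_common(n)


# Made-up sample records with placeholder organization names.
sample_flows = [
    {"src_org": "Org A", "dst_org": "Org B", "bytes": 5_000_000_000},
    {"src_org": "Org A", "dst_org": "Org B", "bytes": 2_000_000_000},
    {"src_org": "Org C", "dst_org": "Org A", "bytes": 3_000_000_000},
    {"src_org": "Org C", "dst_org": "Org B", "bytes": 10_000},  # below threshold
]

ranking = top_pairs(sample_flows)
for (src, dst), total in ranking:
    print(f"{src} -> {dst}: {total} bytes")
```

Grouping by a different key, such as a project or science domain label, would follow the same pattern.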
Currently, a broad set of Quilt members have deployed NetSage as part of their partnership with the Engagement and Performance Operations Center (EPOC). These include: Front Range GigaPop (FRGP); Great Plains Network (GPN); iLight/Indiana GigaPop; KINBER; LEARN; Pacific Northwest GigaPop/PacificWave; SoX; Sun Corridor Network; and TACC. Contact [netsage at iu dot edu] to learn more.
Founded in 1820, Indiana University is one of the world’s foremost public institutions. With nearly 100,000 students and more than 20,000 employees statewide, IU continues to pursue its core missions of education and research while building a foundation for the university’s enduring strengths in teaching and learning, world-class scholarship, innovation, creative activity, community engagement and academic freedom. Bloomington is the flagship campus of the university, and each one of IU’s seven campuses is an accredited, four-year degree-granting institution.