Tutorial 1 (8:30 - 12:30)
Title
Accelerating Big Data Processing with Hadoop, Spark, and Memcached Over High-Performance Interconnects
Speakers
Dhabaleswar K. (DK) Panda and Xiaoyi Lu (Ohio State University)
Abstract
Apache Hadoop and Spark are gaining prominence in handling Big Data and analytics. Similarly, Memcached is becoming important for large-scale query processing in Web 2.0 environments. Recent studies have shown that default Hadoop, Spark, and Memcached cannot efficiently leverage the features of modern high-performance computing clusters, such as Remote Direct Memory Access (RDMA) enabled high-performance interconnects and high-throughput, large-capacity parallel storage systems (e.g. Lustre). These middleware are traditionally written with sockets and do not deliver the best performance on modern high-performance networks. In this tutorial, we will provide an in-depth overview of the architecture of Hadoop components (HDFS, MapReduce, RPC, HBase, etc.), Spark, and Memcached. We will examine the challenges in re-designing the networking and I/O components of these middleware with modern interconnects and protocols (such as InfiniBand, iWARP, RoCE, and RSocket) with RDMA, and with modern storage architectures. Using the publicly available software packages in the High-Performance Big Data (HiBD, http://hibd.cse.ohio-state.edu) project, we will provide case studies of the new designs for several Hadoop/Spark/Memcached components and their associated benefits. Through these case studies, we will also examine the interplay between high-performance interconnects, storage systems (HDD and SSD), and multi-core platforms to achieve the best solutions for these components and Big Data applications on modern HPC clusters.
Bio
Dhabaleswar K. (DK) Panda
Dhabaleswar K. (DK) Panda is a Professor of Computer Science at the Ohio State University. He obtained his Ph.D. in computer engineering from the University of Southern California. His research interests include parallel computer architecture, high performance computing, communication protocols, file systems, network-based computing, and Quality of Service. He has published over 350 papers in major journals and international conferences related to these research areas. Dr. Panda and his research group members have been doing extensive research on modern networking technologies including InfiniBand, HSE and RDMA over Converged Enhanced Ethernet (RoCE). His research group is currently collaborating with National Laboratories and leading InfiniBand and 10GigE/iWARP companies on designing various subsystems of next generation high-end systems. The MVAPICH2 (High Performance MPI over InfiniBand, iWARP and RoCE) open-source software package, developed by his research group, is currently being used by more than 2,400 organizations worldwide (in 75 countries). This software has enabled several InfiniBand clusters (including the 7th-ranked one) to get into the latest TOP500 ranking. These software packages are also available with the Open Fabrics stack for network vendors (InfiniBand and iWARP), server vendors and Linux distributors. The new RDMA-enabled Apache Hadoop and Memcached packages, consisting of acceleration for HDFS, MapReduce, RPC and Memcached and support for clusters with Lustre file systems, are publicly available from http://hibd.cse.ohio-state.edu. Dr. Panda's research is supported by funding from the US National Science Foundation, the US Department of Energy, and several industry partners including Intel, Cisco, SUN, Mellanox, QLogic, NVIDIA and NetApp. He is an IEEE Fellow and a member of ACM. More details about Dr. Panda, including a comprehensive CV and publications, are available here.
Xiaoyi Lu
Dr. Xiaoyi Lu is a Senior Research Associate in the Department of Computer Science and Engineering at the Ohio State University, USA. He obtained his Ph.D. degree in Computer Science from the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China. His current research interests include high-performance interconnects and protocols, Big Data, Hadoop/Spark Ecosystem, Parallel Computing Models (MPI/PGAS), GPU/MIC, Virtualization and Cloud Computing. He has published over 40 papers in major journals and international conferences related to these research areas. He has been actively involved in various professional activities in academic journals and conferences. Currently, Dr. Lu is conducting research and working on design and development for the High-Performance Big Data project (http://hibd.cse.ohio-state.edu). He is a member of IEEE. More details about Dr. Lu are available here.
Tutorial 2 (8:30 - 12:30)
Title
ONOS Tutorial Hot Interconnect
Speaker
Thomas Vachuska, Madan Jampani, Ali Al-Shabibi and Brian O'Connor (ONOS)
Abstract
ONOS is an open source SDN network operating system architected for service provider and mission critical networks. A logically centralized but physically distributed architecture enables ONOS to deliver high performance, scale and high availability. ONOS has well-defined northbound and southbound abstractions which enable it to support a diversity of applications, devices and protocols. This tutorial aims to familiarize SDN enthusiasts with ONOS and give them hands-on experience of developing and testing an ONOS application.
About ONOS: ONOS was open sourced on Dec 5th 2014. The ONOS ecosystem comprises ON.Lab; organizations that are funding and contributing to the ONOS initiative, including Tier 1 service providers AT&T, NTT Communications, SK Telecom and China Unicom, and leading vendors Ciena, Cisco, Ericsson, Fujitsu, Huawei, Intel and NEC; members who are collaborating and contributing to ONOS, including ONF, Infoblox, SRI, Internet2, Happiest Minds, CNIT, Black Duck, Create-Net, KISTI and KREONET; and the broader ONOS community. You can learn more about ONOS at onosproject.org.
Bio
Thomas Vachuska, Chief Architect
Thomas joins ON.Lab after a 27-year career in HP and brings with him a solid background in software architecture of distributed systems and modular object-oriented design. He is an avid proponent of agile & test-driven development processes and is an unceasing advocate for elegance through simplicity. Thomas continues to stay familiar with a wide field of evolving technologies and software development tools and manages to remain actively engaged in code development.
While at HP, Thomas worked in a number of different divisions, where he architected, designed and helped develop distributed software systems for a variety of domains, spanning manufacturing control, network & storage management, data deduplication, and, most recently, software-defined networking. His tenure included two extended trips to Böblingen, Germany, where he worked with local teams to develop custom manufacturing control systems. Thomas is a principal author of a number of software patents held by HP.
Thomas studied mathematics and physics at Charles University in Prague and, after immigrating to the United States in 1982, continued his education at California State University, Sacramento, where he earned a BA degree in mathematics.
Outside of work, Thomas enjoys traveling, skiing, kayaking, biking and generally spending time outdoors with his wife Carol and their grown children.
Madan Jampani, Distributed Systems Architect
Prior to joining ON.Lab, Madan worked as a Senior Software Engineer at Amazon.com, where he started his career 10 years ago. At Amazon, Madan was instrumental in building several key technologies ranging from Amazon retail ordering systems to distributed data stores and shared compute clusters for running large-scale data processing and machine learning workloads. Madan brings expertise in distributed systems, having been instrumental in the creation of Dynamo, a NoSQL data store that is a highly cited and seminal work in the area of distributed key-value storage technologies.
Madan earned his M.S. in Computer Science from Georgia Tech, and his B.Tech in Computer Science and Engineering from the National Institute of Technology, Warangal, India. Outside of work, Madan enjoys hiking, playing tennis and reading about various science-related topics. He and his wife Swathi have two beautiful daughters who keep them fully occupied.
Ali Al-Shabibi, Lead Developer
Ali Al-Shabibi is the lead engineer and maintainer of FlowVisor, a network hypervisor, at the Open Networking Laboratory. Previously, he was a post-doc at Stanford University researching OpenFlow and SDNs in Nick McKeown's group. He received his Ph.D. from the University of Heidelberg in Germany in 2011 after performing his doctoral research at CERN (European Centre for Nuclear Research) in the ATLAS (A Toroidal LHC ApparatuS) Networking group, where he contributed to the design and development of the TDAQ (Trigger and Data Acquisition) Network. Ali Al-Shabibi brings vast knowledge of flow models and congestion avoidance protocols. He comes to ON.Lab from Stanford and CERN, where he analyzed large, mountainous systems, such as 'Portes du Soleil' and 'Argentiere.' Urban wannabe, unrelenting espresso consumer and dedicated traveler, Ali Al-Shabibi is an avid soccer and table tennis player. While he was born in Baghdad, he grew up in Geneva, Switzerland, where he attended the Swiss Federal Institute of Technology (EPFL) for his BSc and MSc degrees. Nowadays, Al-Shabibi can be found theorizing and philosophizing about SDNs or dreaming up cool networking applications with the ONRC crew over many coffees.
Brian O'Connor, Lead Developer
Brian O'Connor received Bachelor's and Master's degrees in Computer Science from Stanford University. At Stanford, he helped develop "An Introduction to Computer Networking," one of Stanford's first MOOCs (Massive Open Online Courses), in addition to teaching Computer Networking and Digital Photography. His academic focus spanned theory, systems, and computer networking. At ON.Lab, Brian is one of the core developers on the Mininet project. Out of the office, Brian enjoys staying active on his bike, on the slopes, and in the pool.
Tutorial 3 (13:30 - 17:30)
Title
Flow and Congestion Control for High Performance Clouds: How to Design and Tune the Datacenter Fabric & SDN for Big Data
Speaker
Mitch Gusat, IBM Research (Zurich Laboratory)
Abstract
Driven by increasingly delay-sensitive workloads, a proliferation of open-sourced stacks and evolving IETF/IEEE standards - e.g. Spark, flattened datacenter fabrics, DC/Multipath-TCP, and Converged Enhanced Ethernet (CEE) - wired datacenter networking is undergoing a silent, yet disruptive transition. Contributing to a 'perfect storm' in the Cloud, VM/container-based virtualization and Software Defined Networking (SDN) technologies introduce new protocols - i.e. performance challenges and opportunities, as we shall prove - inserted in the data plane between the TCP/UDP socket stack and the physical network.
For example, a physical opportunity in the datacenter fabric is the rise of lossless 50/100/400Gbps Ethernet, initially converging the cluster and storage networks. After a brief 101 on the must-know basics of flow and congestion control, the first part of the tutorial introduces - our previously HPC-tested! - practical methods on how to apply Ethernet/InfiniBand's control features to achieve stable and robust delay distributions across a variety of datacenter workloads.
Next we shift our focus to SDN and virtualization's purview on Big Data workload performance, addressing the emerging challenges of a virtualized Hadoop/Spark datacenter configuration. Using similar (again HPC-like) design principles as above, we exemplify the state-of-the-art in SDN performance with some recently published datacenter networking schemes that can deliver order-of-magnitude performance and fairness benefits - e.g., by eliminating TCP incast throughput collapse and by reducing the flow-completion time of latency-critical Hadoop/HPC applications.
Bio
Mitch Gusat
Mitch Gusat is a researcher and Master Inventor at the IBM Zurich Research Laboratory. His current focus is on datacenter and cloud fabrics, virtual networking and their feedback control and performance, modeling of distributed systems, SDN, scheduling, switching and lossless datacenter networks beyond 400Gbps including their flow and congestion control, adaptive routing, workload optimization and monitoring. In this area he has actively led or contributed to the design and standardization of Converged Enhanced Ethernet/802 DCB, InfiniBand and RapidIO - while also advising his Master and PhD students from several European universities. His other research interests include control, optimization, SDN, HPC interconnection networks, shared (virtual) memory, real-time scheduling, high performance protocols and IO acceleration. Previously he was a Research Associate at the University of Toronto where he contributed to the design and construction of NUMAchine, a 64-way cache-coherent computer. In a former lifetime, Mitch was a student and then a researcher at the "Politehnica" University of Timisoara, where he designed multiprocessor systems, parallel video interfaces, algorithms and image processors for Nuclear Cardiology. He holds Master's degrees in CE and EE, respectively, from the above universities. He is a member of ACM and IEEE, and holds a few dozen patents related to SDN, transports, HPC architectures, switching and scheduling.
Tutorial 4 (13:30 - 17:30)
Title
Software-defined Wide-Area Networking: Challenges, Opportunities and Reality
Speaker
Inder Monga, Division Deputy for Technology, Chief Technology Officer for Energy Sciences Network & Srini Seetharaman, Cloud Platform Architect at Infinera and Founder of SDN Hub
Abstract
While it has been over 3 years since Google publicly commented about using Software-defined Networking (SDN) for the wide-area interconnection in its G-Scale network, the field of Software-defined Interconnections is as nascent as ever. We have several forums discussing the advantages of SD-WAN, SD-Exchanges and SD-Interconnection, with different target users ranging from enterprises to HPC clouds to service providers. There is, however, no clear consensus on the abstractions and potential of SDN in this multi-faceted space.
The goal of this tutorial is to shed light on the different aspects of SDN for Wide-area Networking, and to provide attendees with clarity on:
- the challenges in traditional approaches with WAN management and orchestration,
- the vision of where WAN can grow to,
- what software-defined networking abstractions are relevant in the WAN domain and what traditional abstractions are still worth preserving, and
- top use-cases (categorized by users) for software-defined control for Wide-area Networking and associated deployment challenges in the real world.
Bio
Inder Monga
Indermohan (Inder) S. Monga serves as the Division Deputy of Technology of the Scientific Networking Division, Lawrence Berkeley National Lab, and CTO of Energy Sciences Network. Mr. Monga plays a key role in developing and deploying advanced networking services for collaborative and distributed "big-data" science. Mr. Monga's research interests include software-defined networking, network virtualization, SDX, energy efficiency and distributed computing. He is currently working actively on research and technology development focused on the broad adoption of SDN in the wide-area network, including recent work on Transport SDN. He also serves as Chair of ONF Research Associates and contributes to SDN standards. He currently holds 17 patents and has over 15 years of industry and research experience in telecommunications and data networking.
Srini Seetharaman
Srini Seetharaman is Cloud Platform Architect at Infinera and Founder of SDN Hub. Previously he was a Technical Lead for Software-defined Networking (SDN) at Deutsche Telekom Innovation Center. Before that he was a member of the OpenFlow/SDN team at Stanford, where he led the SDN deployments in several nation-wide campus enterprise networks, including Stanford. He holds a Ph.D. in Computer Science from the Georgia Institute of Technology.