Unit 5: AI Software Ecosystems

Scene 1 (0s)

[Virtual Presenter] Welcome. In this unit, we'll cover the software ecosystem that has enabled developers to make use of GPU computing for data science.

Scene 2 (11s)

[Audio] We'll start with a brief overview of vGPU as a foundational technology. From there, we'll move into what frameworks are and their benefits for AI. We'll also provide an overview of the NVIDIA software stack and the CUDA-X AI software acceleration libraries. Later, we'll move on to NVIDIA's containerized software catalog, known as NGC, and discuss how NVIDIA is extending AI to every enterprise using virtualization with the NVIDIA AI Enterprise software suite.

Scene 3 (43s)

[Audio] By the end of this unit, you'll be able to: understand virtual GPU as the foundational technology upon which the AI ecosystem sits; briefly describe the deep learning stack and CUDA; define the steps that make up the AI workflow; identify the various types of frameworks, from open-source and third-party vendors as well as those provided by NVIDIA; describe what makes up NGC and the Enterprise catalog and discuss their benefits; and walk through the benefits and features of NVIDIA AI Enterprise and NVIDIA's provided AI workflows.

Scene 4 (1m 17s)

[Audio] Let's get started. GPU Virtualization (vGPU).

Scene 5 (1m 23s)

[Audio] Before we get into AI frameworks and the way NVIDIA provides and supports these frameworks, let's take a few minutes to briefly cover vGPU as a foundational technology. The workplace has experienced pandemic-driven disruption that is changing how and where we work. The adoption of digital technologies has helped organizations respond to unprecedented challenges and has made a mobile workforce increasingly prevalent. By 2030, end-user computing is expected to grow to $20 billion, with 40% of storage and compute shifting toward service-based models. However, to build an enhanced digital workspace for the post-pandemic recovery and beyond, we must move beyond defensive short-term models and focus on sustainable, resilient operating methods. Improved user experience paired with security stands at the forefront of the corporate agenda. In fact, 53% of IT executives report their companies are increasing investment in digital transformation, while 49% are looking to improve efficiencies.

Scene 6 (2m 29s)

[Audio] This is where NVIDIA virtual GPU technology comes into play, allowing IT to deliver graphics-rich virtual experiences across their user base, whether deploying office productivity applications for knowledge workers or providing engineers and designers with high-performance virtual workstations to access professional design and visualization applications. IT can deliver an appealing user experience while maintaining the productivity and efficiency of their users. Application and desktop virtualization solutions have been around for a long time, but their number one point of failure tends to be user experience. The reason is very simple: when applications and desktops were first virtualized, GPUs were not part of the mix. This meant that all of the capture, encode, and rendering work traditionally done on a GPU in a physical device was being handled by the CPU in the host. Enter NVIDIA's virtual GPU, or vGPU, solution. It enables IT to virtualize a GPU and share it across multiple virtual machines, or VMs. This not only improves performance for existing VDI environments, but it also opens up a whole new set of use cases that can leverage this technology. With our portfolio of virtual GPU solutions, we enable accelerated productivity across a wide range of users and applications. Knowledge workers benefit from an improved experience with office applications, browsers, and high-definition video, including video conferencing like Zoom, WebEx, and Skype. For creative and technical professionals, NVIDIA enables virtual access to the professional applications that typically run on physical workstations, including CAD and design applications such as Revit and Maya. It enables GIS apps like Esri ArcGIS Pro, oil and gas apps like Petrel, financial services apps like Bloomberg, healthcare apps like Epic, and manufacturing apps like CATIA, Siemens NX, and SolidWorks, to name a few.

Scene 7 (4m 25s)

[Audio] Our virtual GPU software is available for on-prem data centers and also in the cloud: NVIDIA Virtual PC (vPC) and Virtual Applications (vApps) software for knowledge and business workers, and NVIDIA RTX Virtual Workstation (vWS) for creative and technical professionals such as engineers, architects, and designers. We have a series of courses to walk you through each software offering. Please review the virtualization sales curriculum for more detailed information.

Scene 8 (4m 57s)

[Audio] Let's review how NVIDIA virtual GPU software enables multiple virtual machines to have direct access to a single physical GPU while using the same NVIDIA drivers that our customers deploy on non-virtualized operating systems. On the left-hand side, we have a standard VMware ESXi host. VMware has done a great job over the years virtualizing CPU workloads. However, certain tasks are more efficiently handled by dedicated hardware such as GPUs, which offer enhanced graphics and accelerated computing capabilities. On the right side, from the bottom up, we have a server with a GPU running the ESXi hypervisor. When the NVIDIA vGPU manager software, or VIB, is installed on the host server, we're able to assign vGPU profiles to individual VMs. NVIDIA-branded drivers are then installed into the guest OS, providing for a high-end user experience. This software enables multiple VMs to share a single GPU, or, if there are multiple GPUs in the server, they can be aggregated so that a single VM can access multiple GPUs. This GPU-enabled environment provides for unprecedented performance while enabling support for more users on a server, because work that was done by the CPU can now be offloaded to the GPU.
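As a hedged sketch of the host-side setup described above (the datastore path and VIB file name are placeholders, not taken from this course; use the bundle shipped with your NVIDIA vGPU software release), installing the vGPU manager on an ESXi host generally looks like this:

```shell
# Illustrative only -- the VIB path below is a placeholder.
# Run from the ESXi shell to install the NVIDIA vGPU manager VIB:
esxcli software vib install -v /vmfs/volumes/datastore1/NVIDIA-vGPU-manager.vib

# After rebooting the host, confirm the host driver can see the physical GPU:
nvidia-smi
```

Once the VIB is installed and the host rebooted, vGPU profiles can be assigned to individual VMs from the hypervisor's management interface.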

Scene 9 (6m 15s)

[Audio] Most people understand the primary benefit of GPU virtualization: the ability to divide up GPU resources and share them across multiple virtual machines while delivering the best possible performance. But there are many other benefits delivered by the NVIDIA virtual GPU software included in the NVIDIA AI Enterprise suite, which go beyond just GPU sharing. With NVIDIA vGPU software, IT can deliver bare-metal performance for compute workloads, with minimal overhead, while running virtualized. Integrations with partners like VMware provide IT a complete lifecycle approach to operational management, from infrastructure right-sizing to proactive management and issue remediation. These integrations allow IT to use the same familiar management tools from the hypervisor and leading monitoring software vendors for deep insights into GPU usage. NVIDIA vGPU also supports live migration of accelerated workloads without interruption to end users, allowing for business continuity and workload balancing. The ability to flexibly allocate GPU resources means IT can better utilize the resources in the data center. And since virtualization enables all data to remain securely in the data center, the solution helps ensure infrastructure and data security.

Scene 10 (8m 57s)

[Audio] Let's now explore deep learning. We'll start with a brief review of what it is, then walk through an AI workflow. From there, we'll talk about the AI software stack and CUDA-X.

Scene 11 (9m 13s)

[Audio] Deep learning is a subclass of machine learning. It uses neural networks to train a model using very large data sets, in the range of terabytes or more of data. Neural networks are algorithms that mimic the human brain in understanding complex patterns. Labeled data is a set of data with labels that help the neural network learn; in the example here, the labels are the objects in the images: cars and trucks. The errors that the classifier makes on the training data are used to incrementally improve the network structure. Once the neural-network-based model is trained, it can make predictions on new images. Once trained, the network and classifier are deployed against previously unseen data, which is not labeled. If the training was done correctly, the network will be able to apply its feature representation to correctly classify similar classes in different situations.
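The train-on-errors, then predict-on-unseen-data loop described above can be sketched with a deliberately tiny stand-in model. This is not a neural network or any NVIDIA code; it is a single-neuron (perceptron-style) classifier on made-up one-dimensional "features", just to show errors on labeled data incrementally adjusting a model that is then applied to unlabeled inputs:

```python
# Toy sketch: a single-neuron classifier trained the way the narration
# describes -- errors on labeled data nudge the model parameters, and the
# trained model then labels previously unseen data. All values are invented.

def train(samples, labels, lr=0.1, epochs=50):
    """Fit weights (w, b) so that the prediction matches the labels."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if w * x + b > 0 else 0
            error = y - pred           # classification error on this sample
            w += lr * error * x        # incrementally improve the model
            b += lr * error
    return w, b

def predict(w, b, x):
    return 1 if w * x + b > 0 else 0

# "Labeled data": small feature values are class 0 (car), large are class 1 (truck).
X = [1.0, 1.5, 2.0, 6.0, 7.0, 8.0]
y = [0, 0, 0, 1, 1, 1]
w, b = train(X, y)

# Previously unseen, unlabeled inputs:
print(predict(w, b, 1.2))  # -> 0
print(predict(w, b, 7.5))  # -> 1
```

A real deep learning framework replaces this hand-rolled loop with many-layered networks, automatic differentiation, and GPU-accelerated math, but the train/deploy shape is the same.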

Scene 12 (10m 11s)

[Audio] To understand the AI ecosystem, you have to start with the workflow. The first step is the process of preparing raw data and making it suitable for the machine learning model. Examples of tools for this are NVIDIA RAPIDS and the RAPIDS Accelerator for Apache Spark. Once the data is processed, we move on to the training phase. This is where we teach the model to interpret data. Examples of tools for this are PyTorch, the NVIDIA TAO Toolkit, and TensorFlow. Next, we refine the trained model through optimization. An example tool for this is TensorRT. Finally, we deploy the model, making it available for systems to receive data and return predictions. The NVIDIA Triton Inference Server allows the simple deployment of scalable AI models in production.
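The four workflow stages can be sketched as a plain-Python pipeline. These placeholder functions are illustrative only, not a real API; in practice each stage maps to the tools named above (RAPIDS for preparation, PyTorch/TAO/TensorFlow for training, TensorRT for optimization, Triton for serving):

```python
# Hedged sketch of the AI workflow: prepare -> train -> optimize -> deploy.
# Function names and the "model" are invented stand-ins for illustration.

def prepare(raw):
    """Data prep: clean raw records into numeric features (RAPIDS' role)."""
    return [float(r) for r in raw if r is not None]

def train(features):
    """Training: fit a trivial 'model' -- here just a threshold (PyTorch's role)."""
    return {"threshold": sum(features) / len(features)}

def optimize(model):
    """Optimization: simplify parameters for fast inference (TensorRT's role)."""
    return {"threshold": round(model["threshold"], 2)}

def serve(model, x):
    """Deployment: answer incoming prediction requests (Triton's role)."""
    return "positive" if x > model["threshold"] else "negative"

raw_data = ["1.0", None, "2.0", "3.0"]     # raw input with a bad record
model = optimize(train(prepare(raw_data)))
print(serve(model, 5.0))                    # -> positive
```

The point of the ecosystem is that each of these hand-waved stages is replaced by a GPU-accelerated, production-grade tool with the same pipeline shape.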

Scene 13 (10m 59s)

[Audio] So what are frameworks? Frameworks are designed to provide higher-level building blocks that make it easy for data scientists and domain experts in computer vision, natural language processing, robotics, and other areas to design, train, and validate AI models. They can be an interface, library, or tool that allows developers to more easily and quickly build models. Data scientists use frameworks to create models for a variety of use cases such as computer vision, natural language processing, and speech recognition. For example, MXNet is a modern open-source deep learning framework used to train and deploy deep neural networks. It is scalable, allowing for fast model training, and supports a flexible programming model and multiple languages. The MXNet library is portable and can scale to multiple GPUs and multiple machines. scikit-learn is a free machine learning library for the Python programming language. It features various classification, regression, and clustering algorithms and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. TensorFlow is a popular open-source software library for data flow programming across a range of tasks. It is a symbolic math library and is commonly used for deep learning applications. NVIDIA Isaac Lab is a lightweight application built on Isaac Sim for robot learning. Isaac Lab is optimized for reinforcement, imitation, and transfer learning and can train all types of robot embodiments.
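The "higher-level building blocks" a framework provides can be illustrated with the fit/predict estimator interface that scikit-learn popularized. The class below is a toy stand-in written in plain Python, not a real scikit-learn estimator; it exists only to show the shape of the abstraction a framework hands you:

```python
# Toy illustration of a framework-style building block: an estimator object
# with fit() and predict(), the interface pattern used by scikit-learn.
# NearestMeanClassifier is an invented example, not a real library class.

class NearestMeanClassifier:
    """Predicts the class whose training-set mean is closest to the input."""

    def fit(self, X, y):
        groups = {}
        for xi, yi in zip(X, y):
            groups.setdefault(yi, []).append(xi)
        # One learned parameter per class: the mean of its training samples.
        self.means_ = {label: sum(v) / len(v) for label, v in groups.items()}
        return self

    def predict(self, X):
        return [min(self.means_, key=lambda c: abs(x - self.means_[c]))
                for x in X]

clf = NearestMeanClassifier().fit([1.0, 2.0, 8.0, 9.0],
                                  ["cat", "cat", "dog", "dog"])
print(clf.predict([1.5, 8.5]))  # -> ['cat', 'dog']
```

With a real framework, the same two calls drive far more sophisticated models, which is exactly the productivity benefit the narration describes.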

Scene 14 (12m 41s)

[Audio] The diagram shows the software stack for deep learning. The hardware comprises a system, which can be a workstation or a server, with one or more GPUs. The system is provisioned with an operating system and an NVIDIA driver that enables the deep learning framework to leverage the GPU functions for accelerated computing. Containers are becoming the choice for development in organizations. NVIDIA provides many frameworks as Docker containers through NGC, which is a cloud registry for GPU-accelerated software. It hosts over 100 containers for GPU-accelerated applications, tools, and frameworks. These containers help with faster and more portable development and deployment of AI applications on GPUs across the cloud, data center, and edge, and are optimized for accelerated computing on GPUs. Hence, the stack includes the NVIDIA Docker runtime, specific to NVIDIA GPUs. The containers include all the required libraries to deliver high-performance GPU acceleration during the processing required for training. The CUDA Toolkit is NVIDIA's groundbreaking parallel programming model that provides essential optimizations for deep learning, machine learning, and high-performance computing, leveraging NVIDIA GPUs.
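As a hedged sketch of how this containerized stack is typically exercised (the image tag below is a placeholder; check the NGC catalog at ngc.nvidia.com for current repository names and tags), pulling and running a GPU-accelerated framework container might look like:

```shell
# Illustrative commands -- the tag "24.01-py3" is a placeholder; pick a
# current tag from the NGC catalog. Requires Docker plus the NVIDIA
# container runtime/toolkit and an NVIDIA driver on the host.

# Pull a GPU-accelerated deep learning framework container from NGC:
docker pull nvcr.io/nvidia/pytorch:24.01-py3

# Run it with GPU access exposed to the container:
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:24.01-py3
```

Everything above the driver in the diagram (CUDA Toolkit, libraries, framework) ships inside the container, which is why the same image runs across cloud, data center, and edge hosts.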

Scene 15 (14m 2s)

NVIDIA Deep Learning Software Stack
• CUDA — NVIDIA's groundbreaking parallel programming model; provides essential optimizations for deep learning, machine learning, and high-performance computing (HPC) leveraging NVIDIA GPUs
• NVIDIA Container Runtime — enables GPUs to be used inside containers
• NGC Containers — publicly available containers optimized to run on NVIDIA GPUs
• DL Frameworks — popular deep learning frameworks available inside the containers; a range of interfaces can be used
[Stack diagram, top to bottom: Deep Learning Frameworks · Deep Learning Libraries · CUDA Toolkit · Mounted NVIDIA Driver · Container OS (the containerized tool) · NVIDIA Container Runtime for Docker · Docker Engine · NVIDIA Driver · Host OS]

Scene 16 (14m 23s)

How Do I Build an AI Platform? Two ways to build an AI platform: Do It Yourself (DIY), or NVIDIA AI Enterprise.

Scene 17 (14m 32s)

[Audio] There are two ways you can go about building an AI platform. You can either take the do-it-yourself approach or leverage NVIDIA AI Enterprise, both of which we'll discuss over the next two sections. Leveraging open-source software has become a mainstream method for AI and machine learning development because it can be collaboratively shared and modified upon distribution. However, building your own AI platform based on open source can be risky without robust support for production AI. Open-source software is often distributed and maintained by community developers without dedicated resources for quality assurance and verification. Open-source software deployment is often limited to the current GPU architecture and offers only self-service support.

Scene 18 (15m 24s)

[Audio] With NVIDIA AI Enterprise, enterprises that leverage open-source practices can build mission-critical applications on top of the NVIDIA AI platform. NVIDIA AI Enterprise provides NVIDIA enterprise support and hardware testing and certification for past, current, and future GPUs. Now that you have an understanding of the two ways you can build an AI platform, let's explore the benefits of the NVIDIA AI Enterprise solution. Whether you take the do-it-yourself approach or download and use NVIDIA AI Enterprise, all software for either approach is provided in NVIDIA's NGC and the Enterprise catalog. Let's take a few minutes to explore that now. Navigating the world of software stacks for AI and accelerated applications is complex. The stack varies by use case: the AI stack is different from that of HPC simulation apps, and the genomics stack is different from that of visualization apps. The underlying software stack to run a particular application also varies across platforms, from on-prem to cloud, from bare metal to containers, and from VMs to microservices. The NGC catalog offers containerized software for AI, HPC, data science, and visualization applications built by NVIDIA and by our partners. The containers allow you to encapsulate the application and its complex dependencies in a single package, simplifying and accelerating end-to-end workflows, and they can be deployed on premises, in the cloud, or at the edge. NGC also offers pre-trained models across a variety of domains and AI tasks, such as computer vision, NLP, and recommender systems. Such pre-trained models can be fine-tuned with your own data, saving you valuable time when it comes to AI model development. Finally, for consistent deployment, NGC also has Helm charts that allow you to deploy your application, and NGC collections, which bring together all the necessary building blocks, helping you build applications faster.
The pre-trained models in the NGC catalog are built and continually trained by NVIDIA experts. For many of our models, we provide model resumes. They're analogous to a job candidate's resume: you can see the data set the model was trained on, training epochs, batch size, and, more importantly, its accuracy. This ensures that users can find the right models for their use case.
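As a hedged sketch of how these catalog assets are fetched in practice (the model name below is an invented placeholder; browse the catalog and consult a model's resume page for real entries, and check the NGC CLI documentation for exact command syntax):

```shell
# Illustrative NGC CLI usage -- the model identifier is a placeholder.

# Configure the CLI with your NGC API key:
ngc config set

# Browse pre-trained models available in the catalog:
ngc registry model list

# Download a specific model version (placeholder name shown):
ngc registry model download-version "nvidia/some_model:1.0"
```

The downloaded model can then be fine-tuned on your own data before deployment, as described above.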

Scene 19 (17m 56s)

NGC and the Enterprise catalog.