
Tech Talk: Embracing Generative AI and getting LLMs to work for you

At an event titled “Unveiling the Future of Large Language Models and Generative AI”, organised by ASUS and NVIDIA, we spoke to August Chao, Chief Engineer at the Taiwan Web Service Corporation Technology Center, and Morris Tan, ASUS Country Product Manager, as they shared how they see the future of generative AI and the changes it will bring to companies and staff.

We had previously covered the two companies when they spoke to us about the considerations and complications of deploying an Omniverse server.

This time, we spoke about the need for high-quality data in the large language model (LLM) used, and how to ensure that an AI deployment delivers real value. More important still was how to measure the ROI of an AI investment against its cost.

How do you determine if you have enough high-quality data for your LLM? What data does a business need to start implementing AI? Does it have the data infrastructure and quality in place? And are there negative consequences if a model doesn’t reach the required accuracy in production?

Businesses can refer to, and consult on, the LLM design guidelines from TWS when developing their dataset. When building a data structure or framework for an LLM application, the deciding factor tends to be the suitability of the data rather than its quantity. To reduce the risk that poor data undermines the application, it is best to engage a qualified LLM consultant, and to consider adopting the RAG (Retrieval Augmented Generation) approach to enhance the user experience and the trustworthiness of your AI application.
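
To make the RAG approach concrete, here is a minimal sketch of the retrieve-then-generate loop. Everything in it is illustrative: the toy hashing embedding, the in-memory document list, and the prompt template are stand-ins for the real embedding model, vector database, and LLM a production deployment would use.

```python
# Minimal sketch of Retrieval Augmented Generation (RAG).
# The hash-based embedding is a toy stand-in for a real embedding model.
import math
from collections import Counter

def embed(text: str, dim: int = 256) -> list[float]:
    """Toy bag-of-words hashing embedding (stand-in for a real model)."""
    vec = [0.0] * dim
    for token, count in Counter(text.lower().split()).items():
        vec[hash(token) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# 1. Index the company's own documents (the "high-quality data").
documents = [
    "Our support line is open 9am-6pm on weekdays.",
    "Refunds are processed within 14 days of a return.",
    "The ESC8000A-E12 server supports up to eight GPUs.",
]
index = [(doc, embed(doc)) for doc in documents]

# 2. At query time, retrieve the most relevant passages...
query = "How many GPUs does the ESC8000A-E12 take?"
q_vec = embed(query)
top = sorted(index, key=lambda d: cosine(q_vec, d[1]), reverse=True)[:2]

# 3. ...and ground the LLM's answer in them, which is what improves
#    the trustworthiness of the response.
prompt = ("Answer using only this context:\n"
          + "\n".join(doc for doc, _ in top)
          + f"\n\nQuestion: {query}")
print(prompt)  # this grounded prompt would be sent to the LLM
```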

Morris Tan, Country Product Manager for Server at ASUS. Image source: ASUS.

How do you determine the issues an AI deployment is meant to resolve and how do you then approach it? How do you measure if it ultimately adds value to the company?

Based on feedback from a diverse base of end users, most focus on leveraging generative AI technology to develop new products and services. Businesses hope that by using generative AI, they can not only fast-track their current business processes but also discover valuable insights that lead to new ones.

We recommend collaborating with a trusted AI solutions partner with the expertise to fully utilise Generative AI technology and to provide training, inferencing, and collaborative research and design suggestions. As the Generative AI landscape is constantly evolving, partnering with a reliable professional will ensure your Generative AI application can be implemented successfully to meet the demands of the business.

When it comes to an AI and LLM deployment, is it better to take a CPU or a GPU server approach? Does it make a difference which servers you train your LLM on? When should a business choose between deploying generative AI and non-generative AI solutions at work?

For AI deployment especially, an infrastructure equipped with GPUs is crucial, but it is not the main factor in its development. A proprietary LLM capable of evolving alongside business growth matters more than the choice of computing accelerator alone. Generative AI is a proven trend that cannot be adequately substituted by traditional machine learning methods or by relying solely on CPU infrastructure.

In its recent report on AI, the Stanford Institute for Human-Centered AI provided some context: GPU performance “has increased roughly 7,000 times” since 2003, and price per performance is “5,600 times greater,” it reported.

August Chao, Taiwan Web Service Corporation Technology Center Chief Engineer.

Are there minimum specification requirements for an LLM and AI deployment? Is it better to keep the compute server separate from the storage servers for LLM and AI deployment?

There are a few considerations when it comes to specification and hardware requirements, such as the size of the data for training and inference, as well as the LLM application’s requirements.

From an infrastructure and business continuity management point of view, we would recommend having compute and storage in separate physical servers, or even in separate locations.

When you are starting out and your dataset is not yet big or comprehensive, the GPU server can double as the storage server for cost and simplicity. However, as your business grows and the size and importance of your Generative AI models and of the data for training and inference expand, splitting the work across separate specialised physical machines (using a dedicated storage server) is a solution to consider, as it helps safeguard all the models and data that the GPU server works with.
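
As a rough illustration of that split, the sketch below shows a GPU node that reads training shards from, and writes checkpoints to, a mount exported by a dedicated storage server. The mount point and file layout are hypothetical; any shared filesystem or object store plays the same role.

```python
# Sketch: GPU node reads training data from, and writes checkpoints to,
# a dedicated storage server (here exposed as an NFS-style mount).
# All paths are hypothetical placeholders.
from pathlib import Path

STORAGE_MOUNT = Path("/mnt/storage-server")   # export from the storage host
DATA_DIR = STORAGE_MOUNT / "datasets" / "llm-finetune"
CKPT_DIR = STORAGE_MOUNT / "checkpoints"

def training_shards():
    """Yield data shards from the storage server, not the GPU node's disk."""
    if DATA_DIR.exists():
        for shard in sorted(DATA_DIR.glob("shard-*.jsonl")):
            yield shard

def save_checkpoint(step: int, blob: bytes) -> None:
    """Checkpoints land on the storage server, so reimaging or losing
    the GPU node does not lose the model."""
    CKPT_DIR.mkdir(parents=True, exist_ok=True)
    (CKPT_DIR / f"step-{step:08d}.bin").write_bytes(blob)

if __name__ == "__main__":
    for shard in training_shards():
        print("would train on", shard)
```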

| Company size | Entry | Mid-range | High-end |
|---|---|---|---|
| Model | ESC4000A-E11 | ESC4000A-E12 | ESC8000A-E12 |
| Dimension | 800mm x 440mm x 88.9mm (2U) | 800mm x 440mm x 88.9mm (2U) | 800mm x 440mm x 174.5mm (4U) |
| CPU | AMD EPYC 7543 (32C) | AMD EPYC 9554 (64C) | 2x AMD EPYC 9654 (192C) |
| Memory | DDR4-3200 32GB x 8 | DDR5-4800 64GB x 12 | DDR5-4800 64GB x 24 |
| GPU | 2x NVIDIA RTX A6000 | 4x NVIDIA RTX 6000 Ada | 8x NVIDIA L40S |
| SSD | 2x 3.84TB NVMe | 2x 7.68TB NVMe | 2x 7.68TB NVMe |
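
A quick way to sanity-check these tiers against a given model is the usual back-of-the-envelope VRAM estimate: parameters times bytes per parameter, plus overhead for the KV cache and activations. The sketch below encodes that rule of thumb; the 1.2x overhead factor is our assumption, not an ASUS sizing figure.

```python
# Back-of-the-envelope GPU memory check for LLM inference.
# Weights need roughly (parameters * bytes per parameter); the 1.2x
# overhead factor for KV cache and activations is an illustrative
# assumption, as real overhead depends on batch size and context length.
def fits_in_vram(params_billions: float, bytes_per_param: float,
                 gpus: int, vram_per_gpu_gb: float,
                 overhead: float = 1.2) -> bool:
    needed_gb = params_billions * bytes_per_param * overhead
    available_gb = gpus * vram_per_gpu_gb
    print(f"~{needed_gb:.0f} GB needed vs {available_gb:.0f} GB available")
    return needed_gb <= available_gb

# A 70B-parameter model in FP16 (2 bytes/param) on the high-end tier:
# eight NVIDIA L40S cards at 48 GB each.
fits_in_vram(70, 2.0, gpus=8, vram_per_gpu_gb=48)   # ~168 GB vs 384 GB -> fits

# The same model on the entry tier: two RTX A6000 at 48 GB each.
fits_in_vram(70, 2.0, gpus=2, vram_per_gpu_gb=48)   # ~168 GB vs 96 GB -> no
```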

How will AI change a company’s marketing, sales, and customer service functions? How can a company use AI for more personalised and targeted outreach?

Generative AI not only offers diverse insights from unstructured data but also crafts suitable responses tailored to customer needs. AI solution partners have successfully implemented multiple instances of Large Language Model fine-tuning, incorporating domain-specific knowledge and seamlessly integrating the models into existing systems. Through our reliable framework for generative AI applications, partners can deliver personalised services, resulting in superior customer experiences for their users.
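
For a sense of what fine-tuning with domain-specific knowledge can look like in code, here is a minimal sketch using LoRA adapters via Hugging Face’s transformers and peft libraries. The base model name and target modules are placeholders, and this is not the specific pipeline TWS or ASUS uses.

```python
# Sketch of domain-specific LLM fine-tuning with LoRA adapters, so only
# a small fraction of the weights are trained. The base model and
# target modules below are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"          # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of weights
# ...then train on the company's domain corpus with a standard Trainer,
# and serve the tuned model behind the existing customer-service system.
```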

Ultimately, do GPU- or CPU-based servers lend themselves better to an AI deployment? How do you sustain the costs of running an AI deployment?

While both GPU and CPU-based servers have their merits for AI deployments, GPUs offer superior performance for compute-intensive AI tasks. However, sustaining costs involves a combination of optimising resources, leveraging cloud services, selecting cost-effective hardware, optimising algorithms, and efficient management practices to ensure the best balance between performance and expenditure.

How do you measure what AI will cost, and what the ROI on these investments will be?

To measure the cost of implementing AI and estimate the ROI on these investments, a comprehensive evaluation approach is crucial. This involves assessing the various expenses of AI adoption, including hardware, talent acquisition, data procurement, development costs, and ongoing operational expenses. At the same time, calculating the return on investment entails examining the potential revenue generation, cost savings, and efficiency enhancements attributable to AI integration.

In the context of our transformative approach, it is essential to recognise the ongoing opportunities within the changing business landscape. While waiting to observe and validate reported paradigm shifts in related industries is prudent, there is also the inherent risk of being left behind in a competitive market.
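
The arithmetic behind that evaluation is simple once the line items are estimated. In the sketch below, every figure is a placeholder chosen to show the calculation, not a real cost or revenue estimate.

```python
# Simple ROI model for an AI deployment. Every figure is a placeholder
# to demonstrate the arithmetic, not a real cost estimate.
costs = {
    "hardware": 150_000,          # GPU servers and storage
    "talent": 200_000,            # hiring and training staff
    "data": 50_000,               # data procurement and cleaning
    "development": 120_000,       # building and integrating the application
    "operations_yearly": 80_000,  # power, cloud, maintenance per year
}
gains_yearly = {
    "revenue_uplift": 250_000,    # new products and services
    "cost_savings": 180_000,      # fast-tracked processes
}

years = 3
one_time = sum(costs.values()) - costs["operations_yearly"]
total_cost = one_time + costs["operations_yearly"] * years
total_gain = sum(gains_yearly.values()) * years
roi = (total_gain - total_cost) / total_cost

print(f"Total cost over {years}y: ${total_cost:,}")
print(f"Total gain over {years}y: ${total_gain:,}")
print(f"ROI: {roi:.0%}")
```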

Therefore, a strategic suggestion would be to commence evaluating the adoption of generative AI with all due haste.
