With insights from:
Silvan Melchior, Lead Data Scientist, silvan.melchior@zuehlke.com
Tomas Dikk, Principal Data Scientist, tomas.dikk@zuehlke.com
Dr. Gabriel Krummenacher, Head of Data Science, gabriel.krummenacher@zuehlke.com

ChatGPT has finally propelled Generative AI into the mainstream, with one million users within a week of launch. By now, everyone who reads or listens to the news has heard of it and almost anyone with a computer has tried it for themselves. It is the most capable of the pre-trained generative large language models, which have progressed with breathtaking speed over the last couple of years.

Now the big question is: what does this development mean for your business? Is it possible to leverage this impressive technology for your industry and to generate more revenue or reduce costs? Or will it mainly benefit big tech companies and product start-ups?

ChatGPT is just the tip of the large language model iceberg. Below the water lie many valuable use cases that are not merely a form of ‘AI assistant’ but that vary broadly across the whole domain of natural language processing. Not only are use cases now possible that were unthinkable a couple of years ago, but existing solutions also benefit from the improved performance of new large language models. And they are not restricted to consumer use cases; a variety of business-critical use cases are also possible. The technology is readily available and can be easily integrated as a cloud service. The question is: are you ready?

What is ChatGPT?

The original concept of large language models such as ChatGPT is to fulfil a simple task: given an incomplete text, they need to predict the next word (or token). This simple objective allows them to be trained on any text: the model is given part of the text, asked to predict the next word, and then told whether its prediction was correct.
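To make the next-word objective concrete, the following minimal Python sketch trains a toy model that simply counts which word follows which in a tiny corpus and then predicts the most frequent follower. It is an illustration only: real large language models use neural networks over tokens rather than word counts, and the function names and the example corpus are assumptions made for this sketch.

```python
from collections import Counter, defaultdict


def train_bigram_model(text: str) -> dict:
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(model: dict, word: str):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text; any text can serve as training data
# because the "label" is always just the next word.
corpus = "the model predicts the next word and the next word follows the text"
model = train_bigram_model(corpus)

print(predict_next_word(model, "the"))   # most frequent word seen after "the"
print(predict_next_word(model, "next"))  # most frequent word seen after "next"
```

The same principle, scaled up to billions of parameters and massive amounts of text, is what underlies the models discussed in this article.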
Although this idea is not new, it only recently became feasible for very large models. This final step required the further development of the models themselves (for example the transformer architecture, the basis of all current language models) and improved algorithms (such as sparse attention, which decreases the computational complexity). Increasingly powerful infrastructure and a lot of engineering experience were of course necessary too.

As these models increased in size and were trained on massive amounts of text, interesting behaviour was observed. In order to fulfil their objective, the models acquired broad knowledge and learned a diverse set of skills from the data they were analysing. Models such as GPT-3 were able to predict the solution to mathematical problems by predicting the next word of ‘8+12=’, they were able to translate text by predicting the whole sentence after ‘DE: Das ist ein Satz / EN:’, and they could answer general questions in a variety of domains.

However, learning from general-purpose text and predicting the next word turned out to be somewhat limiting, in the sense that the models did not always behave in the way expected. For example, if the model was asked a question, it might continue the text with several more questions instead of answering. The usual workaround was ‘prompt engineering’, which is the technique of finding a prompt that makes the model reply in the expected way.

To reduce the need for such workarounds, ChatGPT (and its predecessor InstructGPT) use a technique called ‘Reinforcement Learning from Human Feedback’, in which a large amount of data is collected on how a model should react to a given query; this human-generated data is then used to adapt the pre-trained model so that it behaves more as described in these examples.

All these advancements in models, algorithms, infrastructure and engineering, together with the massive size of the models and their underlying training data, have produced the remarkable models we have today.
This allows ChatGPT to understand, complete, summarise, correct, translate and invent any text we can think of – or at least to act as if it does.

Emerging business capabilities

ChatGPT and other large language models have capabilities that go beyond mere text generation. They can also be used for discriminative tasks such as document retrieval and named entity recognition, thus not only enabling new use cases but also improving existing ones.

The integration of proprietary, non-public data is of central importance to the business use of a large language model. Without it, the model will not be able to find data relevant to your company or answer questions about your documents.

One promising example of this is the use of ChatGPT to find answers to questions that are specific to your business and based on your documents. For this, the relevant documents are processed with the language model to compute a text embedding for each paragraph, and these embeddings are stored in a vector database. For a given user question, the closest-matching paragraphs are retrieved and passed to the model in a prompt that instructs it to answer based only on them. If the user likes the answer, it can also be embedded and stored in the vector database, which allows the system to improve continuously with use.

There are many similar use cases that combine the powerful text generation and understanding capabilities of pre-trained large language models with the searching and indexing of customised, business-specific data. For instance, the generation of reports based on draft notes could be solved in a similar way: historical note-report pairs are indexed and stored, and the most relevant pairs are then used as context for ChatGPT to generate a report for new notes.

Because pre-trained large language models are available through the APIs of popular cloud service providers, it is possible to integrate them into customised software and data science pipelines. ChatGPT itself will also soon be available as a service from OpenAI and Microsoft Azure.
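The document question-answering flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: `embed` is a toy bag-of-words stand-in for a real language-model embedding, `VectorStore` is a hypothetical in-memory stand-in for a vector database, the example paragraphs are invented, and the final call to a hosted language model is deliberately left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a language-model embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._entries = []  # list of (embedding, paragraph) pairs

    def add(self, paragraph: str) -> None:
        self._entries.append((embed(paragraph), paragraph))

    def search(self, question: str, top_k: int = 2) -> list:
        """Return the top_k paragraphs most similar to the question."""
        query = embed(question)
        ranked = sorted(
            self._entries,
            key=lambda entry: cosine_similarity(query, entry[0]),
            reverse=True,
        )
        return [paragraph for _, paragraph in ranked[:top_k]]


def build_prompt(question: str, context: list) -> str:
    """Assemble a prompt that asks the model to answer only from the context."""
    context_block = "\n".join(f"- {paragraph}" for paragraph in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\nAnswer:"
    )


# Invented example paragraphs standing in for company documents.
store = VectorStore()
store.add("Travel expenses must be submitted within 30 days of the trip.")
store.add("The office is closed on public holidays.")
store.add("Laptops are replaced every four years.")

question = "When do I have to submit travel expenses?"
best_matches = store.search(question, top_k=1)
prompt = build_prompt(question, best_matches)
print(prompt)  # this prompt would then be sent to the language model
```

In a real system, `embed` would call an embedding model and `VectorStore` would be replaced by a dedicated vector database. An answer that the user approves of could be fed back with `store.add(...)`, which corresponds to the continuous-improvement loop mentioned above.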
Industry use cases

We have identified a broad spectrum of value-generating use cases for generative large language models across industries, from banking and insurance to life sciences and industry and commerce. Explore your industry of interest and discover the range of possible use cases for the following industries:

Insurance
Banking
Life Science & Pharma
Medical Device & Healthcare
Industrial Sector
Consumer Goods & Commerce
Government & Public
Building Services
Real Estate
Transport & Mobility
Telecommunications

What are the risks and limitations?

As with any powerful tool, there are risks associated with the use of large language models. One of the biggest concerns is the potential for incorrect output being presented with a high degree of confidence. Consequently, it is not possible to trust any generated statement that cannot be directly verified.

Another concern is that the outputs generated by these models can be biased or contain harmful content. This is particularly true when the data used to train the models is not diverse or represents only a narrow range of perspectives. Although there are some techniques that can be used to mitigate these issues, such as data pre-processing and post-processing, the general problem remains unresolved to date.

Therefore, these models should not be used as a wholesale replacement for knowledge bases or search engines. Instead, they should be seen as enhancements that can help to boost the performance of natural language use cases and that should be used in conjunction with your data and custom-trained machine learning models, as we have shown above.
In addition, legal questions of copyright and protection of confidential information remain open, both for training data and model outputs. For example, it is unclear how copyright laws apply to the use of large language models and the outputs they generate. This is an area of ongoing debate and legal action, with open lawsuits against the companies behind models such as GitHub Copilot and Stable Diffusion.

Lastly, large language models are open to misuse and abuse, such as generating fake news or creating the false impression that a human is talking. This highlights the importance of responsible use and the need for regulation and guidelines to ensure that these models are used for the betterment of society.

Looking to the future

As the field of generative AI continues to evolve, its impact on business and society evolves with it.

The most straightforward development is the continuous upscaling of large language models. With increased computational power and larger training datasets, these models are becoming more powerful and knowledgeable and able to perform broader sets of tasks with higher accuracy. This might sound like a trivial advancement, but it is the main performance driver of current models such as ChatGPT, as we outlined in a previous section.

In addition to upscaling, another important development is the move towards faster update cycles and continual learning. This means that large language models will be able to adapt to new information and changing context more quickly.

Another key area of development is the integration of large language models with the Internet, enabling them to access the web to verify their outputs and to research information they don’t have.

Alongside the more traditional modalities of natural language, images and programming code, there are already early examples of models working with many more types of data, such as slides, spreadsheets and action sequences for productivity tools or even robots.
This is an exciting development, as it opens up a wide range of possibilities, including the creation of models that combine these modalities.

The popularity of OpenAI’s solution forces other companies such as Google or Meta to react quickly and bring more of their own work to production. This competition will further accelerate progress and help to diversify the landscape. Just this week, Bard, Google’s own dialogue-based large language model, went into a closed preview which will open soon.

Conclusion

Companies that correctly use the possibilities offered by large language models in their businesses can significantly optimise their processes, reduce costs and save time, while generating new revenue. It is important, however, to be aware of the risks and limitations of these models and to combine their capabilities with customised software and proprietary data. In this way, it is possible not only to create new generative use cases, but also to improve existing ones.

Through numerous machine learning projects in high-risk applications such as medical devices, we have learned that the right framework and processes, such as our Medical AI Process or Responsible AI Framework, make it possible to use the latest state-of-the-art technology in a beneficial way.

Thanks to our expertise and experience with data and AI solutions, we can help you understand the impact of ChatGPT on your business and support you in the implementation of your own large language model. Our experience includes using GPT-3 to create an FAQ search system for customer inquiries in a public authority, and providing consultancy services to a large advertising agency on the application of generative AI in creative projects.

Footnote: We heavily leveraged ChatGPT to write the first draft of this blogpost – of course.

This assessment is purely technical and does not provide any information on possible current/future legal restrictions that may arise when using AI in commercial solutions.
Contact person for Switzerland

Philipp Morf, Head AI & Data Practice

Dr. Philipp Morf holds a doctorate in engineering from the Swiss Federal Institute of Technology (ETH) and has headed the Artificial Intelligence (AI) and Machine Learning (ML) Solutions division at Zühlke since 2015. As Director of the AI Solutions Centre, he designs effective AI/ML applications and is a sought-after speaker on AI topics in the area of applications and application trends. With his many years of experience as a consultant in innovation management, he bridges the gap between business, technology and the people who use AI.

Contact: philipp.morf@zuehlke.com, +41 43 216 6588