Software Engineering Daily
Anaconda is a popular platform for data science, machine learning, and AI. It provides trusted repositories of Python and R packages and has over 35 million users worldwide.
Rob Futrick is the CTO at Anaconda, and he joins the show to talk about the platform, the concept of an OS for AI, and more.
This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and thought leader on cloud computing and application modernization. His best-selling book, Architecting for Scale (O’Reilly Media), is an essential resource for technical teams looking to maintain high availability and manage risk in their cloud environments.
Lee also hosts the podcast Modern Digital Business, an engaging and informative show for people looking to build and grow their digital business with the help of modern applications and processes developed for today’s fast-moving business environment. Listen at mdb.fm. Follow Lee at softwarearchitectureinsights.com, and see all his content at leeatchison.com.
Please click here to see the transcript of this episode.
Sponsorship inquiries: [email protected]
The post Anaconda and Accelerating AI Development with Rob Futrick appeared first on Software Engineering Daily.
Java is one of the most widely used programming languages, and a key contributor to its success is VMware Tanzu’s Spring, the most common framework for Java development. The Spring Framework is built on top of the Java Virtual Machine and provides a consistent programming and configuration model for application developers. From inception, it was designed with developer experience and modularity in mind.
The open-source application framework has been accelerating Java development times since its inception in 2004 (Happy 20th birthday). Since then, the platform has kept expanding, growing 50% year over year during the last five years. In this blog we explore what makes Spring important to Java, how the framework has influenced the developer experience, and the latest version of Spring, which introduces features to support AI integration.
Two decades ago, there were dozens of different ways to connect to a database, something just about every application has to do at some point. “At the time, the various approaches were very cumbersome: developers wrote a lot of code and gained very little functionality,” explained Mark Pollack, a Senior Staff Engineer, Tanzu Division, Broadcom. “Spring provided a lot of value by simplifying the process. Just getting a single app to talk to a database, present a web form, and do transactions correctly was a huge win. In that era, developers could spend weeks trying to create that function.”
Another reason for its success is its enterprise focus. Most open source projects concentrate on the consumer market. However, large companies invest a lot of money building applications to run their businesses. “At the end of the day, large corporations’ largest expense is probably their developers,” explained Ryan Morgan, Senior Director of Engineering, Tanzu Division, Broadcom. Spring makes developer teams more efficient, which greatly enhances the bottom line.
Through the years, the ecosystem has grown. “There’s a large and vibrant community behind Spring,” noted Tanzu’s Morgan. Now, it has more than 200 different technology starters. These software building blocks make it simpler for software engineers to integrate their code with different third-party systems.
Development problems evolve over time, so various elements have been added. Spring Initializr is a bootstrapping tool, a way for developers to create a new project. “Normally, software engineers started from a blank piece of paper and had to figure out what type of project it was and what type of libraries were needed,” said Tanzu’s Morgan. “Then, they searched the web to find some place in the documentation that told them what library dependency needed to be added for different pieces of functionality. Then invariably, you cut and pasted from something that wasn’t consistent. You ended up with a mess.”
With Spring Initializr, software engineers go to a website that lays out the available options in a typical web form. Then they click Generate, and out comes a shell they can use to start building their program. The solution does not generate application code, but it solves the problem of finding the right dependencies. Developers start faster and with less frustration than with previous methods.
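For illustration (a sketch, not taken from the article), the generated shell amounts to a build file plus a minimal, runnable entry point along these lines:

    package com.example.demo;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    // Enables auto-configuration and component scanning for the whole application
    @SpringBootApplication
    public class DemoApplication {
        public static void main(String[] args) {
            SpringApplication.run(DemoApplication.class, args);
        }
    }

Developers then layer their own controllers, services, and configuration on top of this skeleton.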
Under development is Spring CLI, which not only creates the shell of the app but also includes code. These advances have a significant impact because 1 million new projects are created each month.
The last 10 years have seen a major move to container deployment and Spring has aligned with this paradigm shift. “Really, when you think about all those cloud native patterns, a lot of those container functions are really baked into our projects already,” stated Tanzu’s Morgan. “If you want to do distributed configuration, we have a solution for that. You want to do service discovery; Spring has patterns and tools for that.”
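As a rough sketch of what that looks like in application code (assuming the Spring Cloud config starter is on the classpath; the property name below is illustrative, not from the article):

    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.cloud.context.config.annotation.RefreshScope;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    @RefreshScope   // lets the value below be re-fetched from central config without a redeploy
    public class GreetingController {

        // Pulled from a Spring Cloud Config server rather than a local file
        @Value("${greeting.message:hello}")
        private String message;

        @GetMapping("/greeting")
        public String greeting() {
            return message;
        }
    }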
Recently, a major change to Spring occurred. Rather than release new functions autonomously, they are gathered and bundled into Spring Boot. Version 3.0, which is based on Spring Framework 6.0, requires Java 17 or above. Previously, Spring supported Java 8, so the change is significant for some companies.
Better performance is one benefit from the change. “We’ve seen customers realize 15% performance improvements, just from doing the upgrade,” said Tanzu’s Morgan.
AI is being woven into many applications, especially with the emergence of generative AI solutions, which represent a quantum leap in capabilities and overall intelligence compared to previous iterations of AI. One reason today’s generative models are gaining so much attention is that they work with much larger volumes of information (hundreds of billions of words) and much larger models (hundreds of billions of parameters) than previous AI systems, which gives them unprecedented power and lets them perform far more sophisticated functions.
However as developers try to take advantage of the functionality, platform diversity again presents development challenges. “OpenAI has their API, Amazon Bedrock offers a different one, and so do other companies,” noted Tanzu’s Pollack.
A guiding focus and design principle in the Spring framework is simplifying such work by providing common abstractions over similar technologies and interfaces. Spring AI is quickly becoming the starting point when Java developers write AI applications. “Spring AI has the common patterns that Spring developers are used to,” noted Tanzu’s Pollack. “It can abstract out models, clients, etc. in ways that are familiar to Spring users.”
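As a hedged sketch of what that abstraction looks like in practice (the ChatClient interface and package names here reflect an early Spring AI milestone and may differ in later releases):

    import org.springframework.ai.chat.ChatClient;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class AiController {

        // Auto-configured from whichever provider starter (OpenAI, Bedrock, ...) is on the classpath
        private final ChatClient chatClient;

        public AiController(ChatClient chatClient) {
            this.chatClient = chatClient;
        }

        @GetMapping("/ai/generate")
        public String generate(@RequestParam String prompt) {
            // Application code stays the same regardless of the underlying model provider
            return chatClient.call(prompt);
        }
    }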
Another crucial part of AI applications is using a vector database. Spring supports multiple vector databases, and its portable API simplifies changing implementations. So, Spring streamlines AI application development.
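A sketch of the portable vector store API, again using names from an early Spring AI milestone that may have since evolved:

    import java.util.List;

    import org.springframework.ai.document.Document;
    import org.springframework.ai.vectorstore.VectorStore;

    public class DocumentSearch {

        // Backed by any supported vector database; swapping implementations leaves this code unchanged
        private final VectorStore vectorStore;

        public DocumentSearch(VectorStore vectorStore) {
            this.vectorStore = vectorStore;
        }

        public void index(List<Document> docs) {
            vectorStore.add(docs);                        // embed and store the documents
        }

        public List<Document> findSimilar(String query) {
            return vectorStore.similaritySearch(query);   // nearest-neighbor lookup over embeddings
        }
    }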
Java has been a popular programming language for enterprises for decades, and Spring provides software engineers with tools that enhance the development process. The framework has reached its 20th year of empowering developers, and its engaged community is laying the groundwork for continued expansion in the coming decades. “Maybe one reason why Spring continues to do well is it constantly tries to improve itself and doesn’t just rest on its laurels,” concluded Tanzu’s Pollack.
The post Spring AI and Java in 2024 appeared first on Software Engineering Daily.
Vercel provides a cloud platform to rapidly deploy web projects, and they develop the highly successful Next.js framework. The company recently made headlines when they announced v0 which is a generative AI tool to create React code from text prompts. The generated code uses open-source tools like Tailwind CSS and shadcn/ui.
Lee Robinson is the VP of Product at Vercel. He helps lead the product teams and focuses on developer experience on the platform. He joins the show to talk about Vercel, their AI SDK to easily connect frontend code with LLMs, the v0 AI tool, and more.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to see the transcript of this episode.
Watch the video episode here
Sponsorship inquiries: [email protected]
The post Vercel AI with Lee Robinson appeared first on Software Engineering Daily.
Sean Mullaney is the CTO of Algolia and has worked at Google X, Stripe, and Zolando. He joins the show today to talk about Algolia, neural search, vector compression, search optimization, and more.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to see the transcript of this episode.
Watch the video episode here
Sponsorship inquiries: [email protected]
The post Algolia with Sean Mullaney appeared first on Software Engineering Daily.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here for the transcript of this episode.
Watch the video episode here
Sponsorship inquiries: [email protected]
The post JetBrains AI with Jodie Burchell appeared first on Software Engineering Daily.
This episode of Software Engineering Daily is part of our on-site coverage of AWS re:Invent 2023, which took place from November 27th through December 1st in Las Vegas.
In today’s interview, host Jordi Mon Companys speaks with Ankur Mehrotra who is the Director and GM of Amazon SageMaker.
Jordi Mon Companys is a product manager and marketer that specializes in software delivery, developer experience, cloud native and open source. He has developed his career at companies like GitLab, Weaveworks, Harness and other platform and devtool providers. His interests range from software supply chain security to open source innovation. You can reach out to him on Twitter at @jordimonpmm.
Please click here to see the transcript of this episode.
Sponsorship inquiries: [email protected]
The post AWS re:Invent Special: Sagemaker with Ankur Mehrotra appeared first on Software Engineering Daily.
An embedding is a concept in machine learning that refers to a particular representation of text, images, audio, or other information. Embeddings are designed to make data consumable by ML models.
However, storing embeddings presents a challenge to traditional databases. Vector databases are designed to solve this problem.
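To make the idea concrete (an illustrative sketch, not from the episode): an embedding is a fixed-length vector of numbers, and similarity between two pieces of content is typically measured with a metric such as cosine similarity, which vector databases are built to compute efficiently at scale.

    // Illustrative only: cosine similarity between two embedding vectors of equal length
    public final class EmbeddingMath {
        public static double cosineSimilarity(float[] a, float[] b) {
            double dot = 0.0, normA = 0.0, normB = 0.0;
            for (int i = 0; i < a.length; i++) {
                dot   += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            // Values near 1.0 indicate semantically similar content
            return dot / (Math.sqrt(normA) * Math.sqrt(normB));
        }
    }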
Pinecone has developed one of the most prominent vector databases, widely used for ML and AI applications.
Marek Galovic is a software engineer at Pinecone and works on the core database team. He joins the podcast today to talk about how vector embeddings are created, engineering a vector database, unsolved challenges in the space, and more.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to see the transcript of this episode.
Watch the video episode here
Sponsorship inquiries: [email protected]
The post Pinecone Vector Database with Marek Galovic appeared first on Software Engineering Daily.
Vespa is a fully featured search engine and vector database, and it has integrated ML model inference. The project was open sourced in 2017, and it has since grown into a prominent platform for applying AI to big data sets at serving time.
Vespa began as a project to solve Yahoo’s use cases in search, recommendation, and ad serving. The company made headlines in October when they announced they’re spinning Vespa.ai out of Yahoo as a separate company.
Jon Bratseth is the CEO at Vespa and he joins the show to talk about large language models, retrieval augmented generation (RAG), vector database engineering, and more.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to see the transcript of this episode.
Sponsorship inquiries: [email protected]
The post Vespa.ai with Jon Bratseth appeared first on Software Engineering Daily.
GitHub Copilot is an AI tool developed by GitHub and OpenAI to assist software developers by autocompleting code. Copilot kicked off a revolution in software engineering, and AI assistants are now considered essential tools to many developers.
Joseph Katsioloudes is a cyber security specialist and works at the GitHub Security Lab. He joins the show today to talk about Copilot, the future of software development in an AI world, using AI to improve security, and more.
Check out Joseph’s bio and the Secure Code Game which is an in-repo learning experience that Joseph created to teach how to secure vulnerable code.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to see the transcript of this episode.
Sponsorship inquiries: [email protected]
Watch the video episode here.
The post GitHub Copilot with Joseph Katsioloudes appeared first on Software Engineering Daily.
On a recent trip to my hometown in Eastern Canada, my father picked me up at the airport. One of the first things he asked me was, “Is AI going to take everyone’s jobs?”.
When AI, generative AI, and large language models (LLMs) have become topics of conversation within the senior citizen community of rural Canada, you know they’re on everyone’s minds. Generative AI, and especially the use of LLMs, is the “new new thing”. It dominates my X (i.e. Twitter) feed and nearly every conversation I have about technology.
There’s justifiably a ton of excitement about the power of generative AI, reminiscent of the introduction of the Internet or the first smartphone. Generative AI is poised to transform how we build products, design drugs, write content, and interact with technology. But as the utilization of AI grows, many governments and companies have raised concerns about the privacy and compliance issues that adopters of these technologies face.
The core challenge posed by generative AI right now is that unlike conventional applications, LLMs have no “delete” button. There’s no straightforward mechanism to “unlearn” specific information, no equivalent to deleting a row in your database’s user table. In a world where the “right to be forgotten” is central to many privacy regulations, using LLMs presents some difficult challenges.
So what does all this mean for businesses that are building new AI-powered applications or AI models?
In this post, we’ll explore this question and attempt to provide answers. We’ll examine the potential impact of generative AI, ongoing compliance hurdles, and a variety of privacy strategies. Finally, we’ll examine a novel approach grounded in the IEEE’s recommended architecture for securely storing, managing, and utilizing sensitive customer PII (Personally Identifiable Information)—the data privacy vault.
Imagine the following scenario: You’ve just copied and pasted sensitive contract details into an LLM to get some quick assistance with routine contract due diligence. The LLM serves its purpose, but here’s the catch: depending on how it’s configured, that confidential contract data might linger within the LLM, accessible to other users. Deleting it isn’t an option, predicting its future use—or misuse—becomes a daunting task, and retraining the LLM to “roll it back” to its state before you shared those sensitive contract details can be prohibitively expensive.
The only foolproof solution?
Keep sensitive data far away from LLMs.
Sensitive information, including internal company project names, core intellectual property, or personal data like birthdates, social security numbers, and healthcare records, can inadvertently find its way into LLMs in several ways, whether as part of training data or through prompts shared at inference time.
AI data privacy is a formidable challenge for any company interested in investing in generative AI technology. Recent temporary bans of ChatGPT in Italy and by companies like Samsung have pushed these concerns to the forefront for businesses looking to invest in generative AI.
Even outside of generative AI, there are increasing concerns about protecting data privacy. Meta was recently fined $1.3 billion by the European Union (EU) for its non-compliant transfers of sensitive data to the U.S. And this isn’t just an issue for companies doing business in the EU.
There are now more than 100 countries with some form of privacy regulation in place. Each country’s privacy regulations include unique and nuanced requirements that place a variety of restrictions on the use and handling of sensitive data. The most common restrictions relate to cross-border data transfers, where sensitive data can be stored, and to individual data subject rights such as the “right to be forgotten.”
One of the biggest shortcomings of LLMs is their inability to selectively delete or “unlearn” specific data points, such as an individual’s name or date of birth. This limitation presents significant risks for businesses leveraging these systems.
For example, privacy regulations in Europe, Argentina, and the Philippines (just to name a few) all support an individual’s “right to be forgotten.” This grants individuals the right to have their personal information removed or erased from a system. Without an LLM delete button, there’s no way for a business to address such a request without retraining their LLM from scratch.
Consider the European Union’s General Data Protection Regulation (GDPR), which grants individuals the right to access, rectify, and erase their personal data—a task that becomes daunting if that data is embedded within an LLM. GDPR also empowers individuals with the right to object to automated decision-making, further complicating compliance for companies that use LLMs.
Data localization requirements pose another challenge for users of LLMs. These requirements pertain to the physical location where customer data is stored. Different countries and regions have precise laws dictating how customer data should be handled, processed, stored, and safeguarded. This poses a significant challenge when using an LLM for a company’s global customer base.
Data Subject Access Requests (DSARs) under GDPR and other laws add another layer of complexity. In the EU and California, individuals (i.e., “data subjects”) have the right to request access to their personal data, but complying with such requests proves challenging if that data has been processed by LLMs.
Considering the intricate privacy and compliance landscape and the complexity of LLMs, the most practical approach to maintaining compliance is to prevent sensitive data from entering the model altogether. By implementing stringent data handling practices, businesses can mitigate the privacy risks associated with LLMs, while also maintaining the utility of the model. Many companies have already decided that the risks are too high, so they’ve banned the use of ChatGPT, but this approach is shortsighted. Properly managed, these models can create a lot of value.
To address the privacy challenges associated with generative AI models, there have been a few proposals such as banning or controlling access, using synthetic data instead of real data, and running private LLMs.
Banning ChatGPT and other generative AI systems isn’t an effective long-term strategy, and these other “band aid” approaches are bound to fail as people can find easy workarounds. Using synthetic data replaces sensitive information with similar-looking but non-sensitive data and keeps PII out of the model, but at the cost of losing the value that motivated you to share sensitive data with the LLM in the first place. The model loses context, and there’s no referential integrity between the synthetically generated data and the original sensitive information.
The most popular approach to addressing AI data privacy, and the one that’s being promoted by cloud providers like Google, Microsoft, AWS, and Snowflake, is to run your LLM privately on their infrastructure.
For example, with Snowflake’s Snowpark Model Registry, you can take an open source LLM and run it within a container service in your Snowflake account. They state that this allows you to train the LLM using your proprietary data.
Snowpark Model Registry and Container Service (Source: Snowflake Blog)
However, there are several drawbacks to using this approach.
Outside of privacy concerns, if you’re choosing to run an LLM privately rather than take advantage of an existing managed service, then you’re stuck with managing the updates, and possibly the infrastructure, yourself. It’s also going to be much more expensive to run an LLM privately. Taken together, these drawbacks mean running a private LLM likely doesn’t make sense for most companies.
But the bigger issue is that, from a privacy standpoint, private LLMs simply don’t provide effective data privacy. Private LLMs give you model isolation, but they don’t provide data governance in the form of fine-grained access controls: any user who can access the private LLM can access all of the data that it contains. Data privacy is about giving a user control over their data, but private LLMs still suffer from all of the intrinsic limitations around data deletion that are blocking the adoption of public LLMs.
What matters to a business—and individual data subjects—is who sees what, when, where, and for how long. Using a private LLM doesn’t give you the ability to make sure that Susie in accounting sees one type of LLM response based on her job title while Bob in customer support sees something else.
So how can we prevent PII and other sensitive data from entering an LLM, but also support data governance so we can control who can see what and support the need to delete sensitive data?
In the world of traditional data management, an increasingly popular approach to protecting the privacy of sensitive data is through the use of a data privacy vault. A data privacy vault isolates, protects, and governs sensitive customer data while facilitating region-specific compliance with laws like GDPR through data localization.
With a vault architecture, sensitive data is stored in your vault, isolated outside of your existing systems. Isolation helps ensure the integrity and security of sensitive data, and simplifies the regionalization of this data. De-identified data that serve as references to the sensitive data are stored in traditional cloud storage and downstream services.
De-identification happens through a tokenization process. This is not the same as LLM tokenization, which has to do with splitting text into smaller units. With data de-identification, tokenization is a non-algorithmic approach to data obfuscation that swaps sensitive data for tokens. A token is a pointer that lets you reference something somewhere else while providing obfuscation.
Traditional data management versus a data privacy vault architecture
Let’s look at a simple example. In the workflow below, a phone number is collected by a front-end application. The phone number, along with any other PII, is stored securely in the vault, which is isolated outside of your company’s existing infrastructure. In exchange, the vault generates a de-identified representation of the phone number (e.g. ABC123). The de-identified (or tokenized) data has no mathematical connection to the original data, so it can’t be reverse engineered.
Any downstream services—application databases, data warehouse, analytics, any logs, etc.—store only a token representation of the data, and are removed from the scope of compliance:
Example of a data privacy vault in action
Additionally, a data privacy vault can store sensitive data in a specific geographic location, and tightly control access to this data. Other systems, including LLMs, only have access to non-sensitive de-identified data.
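As a sketch of the collection flow described above (the vault client and record types here are hypothetical, shown only to illustrate the pattern):

    // Hypothetical types, for illustration of the tokenization flow only.
    interface VaultClient {
        // Stores the sensitive value in the vault and returns an opaque token.
        String tokenize(String column, String value);
    }

    record UserRecord(String nameToken, String phoneToken) {}

    class SignupService {
        private final VaultClient vault;

        SignupService(VaultClient vault) {
            this.vault = vault;
        }

        UserRecord register(String fullName, String phoneNumber) {
            // Sensitive values go to the vault; only opaque tokens come back (e.g. "ABC123").
            String nameToken  = vault.tokenize("users.full_name", fullName);
            String phoneToken = vault.tokenize("users.phone_number", phoneNumber);

            // Downstream databases, warehouses, and logs store only these tokens,
            // which cannot be reverse engineered into the original values.
            return new UserRecord(nameToken, phoneToken);
        }
    }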
The vault not only stores and generates de-identified data, but it also tightly controls access to sensitive data through a zero trust model where no user account or process has access to data unless it’s granted by explicit access control policies. These policies are built from the bottom up, granting access to specific columns and rows of PII. This allows you to control who sees what, when, where, for how long, and in what format.
For example, let’s say we have a vault containing customer records with columns defined for a customer’s name, social security number (SSN), date of birth (DOB), and email. In our application we want to support two types of users: support and marketing.
Support doesn’t need to know the exact details about a customer; they only need masked data so they can address the customer by name and verify their identity using the last four digits of the customer’s SSN. We can create a policy for the support role that grants access to only this limited view of the data.
ALLOW READ ON users.full_name, users.ssn, users.email WITH REDACTION = MASKED
ALLOW READ ON users.dob WITH REDACTION = REDACTED
Similarly, a marketing person needs someone’s name and email, but they don’t need the customer’s SSN or need to know how old someone is.
ALLOW READ ON users.full_name, users.email WITH REDACTION = PLAIN_TEXT
ALLOW READ ON users.dob WITH REDACTION = MASKED
ALLOW READ ON users.ssn WITH REDACTION = REDACTED
With roles and policies similar to the ones above in place, the same de-identified data is exchanged with the vault. Based on the caller’s role and associated access control policies, different views of the same sensitive data can be supported.
Different views of sensitive data based on role
Companies can address privacy and compliance concerns with LLMs through a similar application of the data privacy vault architectural pattern. A data privacy vault prevents the leakage of sensitive data into LLMs, addressing privacy concerns around LLM training and inference.
Because data privacy vaults use modern privacy-enhancing technologies like polymorphic encryption and tokenization, sensitive data can be de-identified in a way that preserves referential integrity. This means that responses from an LLM containing de-identified data can be re-identified based on zero trust policies defined in the vault that let you make sure that only the right information is shared with the LLM user. This lets you make sure Susie in accounting only sees what she should have access to (i.e., account numbers and invoice amounts) while Bob in customer support sees only what he needs to do his job.
To preserve privacy during model training, the data privacy vault sits at the head of your training pipeline. Training data that might include sensitive and non-sensitive data goes to the data privacy vault first. The vault detects the sensitive data, stores it within the vault, and replaces it with de-identified data. The resulting dataset is de-identified and safe to share with an LLM.
Model Training Pipeline with a Data Privacy Vault
An LLM doesn’t care whether my name, Sean Falconer, is part of the training data or some consistently generated representation of my name (such as “dak5lhf9w”) is part of the training data. Eventually, it’s just a vector.
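A rough sketch of that de-identification step (the vault interface is hypothetical, and the regular expression is a stand-in for real PII detection):

    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Illustrative sketch of a de-identification pass over training text.
    class TrainingDeidentifier {

        interface Vault {
            // Stores the detected value and returns a consistent token for it.
            String tokenize(String value);
        }

        // Stand-in for real PII detection (emails only, for brevity).
        private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w.-]+\\.\\w+");

        private final Vault vault;

        TrainingDeidentifier(Vault vault) {
            this.vault = vault;
        }

        List<String> deidentify(List<String> documents) {
            return documents.stream().map(this::deidentify).toList();
        }

        private String deidentify(String text) {
            Matcher m = EMAIL.matcher(text);
            StringBuilder out = new StringBuilder();
            while (m.find()) {
                // Replace each detected value with its vault token before the
                // text is ever shared with the LLM training job.
                m.appendReplacement(out, Matcher.quoteReplacement(vault.tokenize(m.group())));
            }
            m.appendTail(out);
            return out.toString();
        }
    }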
Sensitive data may also enter a model during inference. In the example below, a prompt is created asking for a summary of a will. The vault detects the sensitive information, de-identifies it, and shares a non-sensitive version of the prompt with the LLM.
Since the LLM was trained on non-sensitive and de-identified data, inference can be carried out as normal.
Example of de-identifying inference data with an LLM and a data privacy vault
On egress from the LLM, the response is passed through the data privacy vault for re-identification. Any de-identified data will be re-identified assuming the end-user has the right to see the information, according to explicit access control policies configured in the vault.
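Sketched in code (all interfaces here are hypothetical, intended only to illustrate the prompt and response flow around the vault):

    // Hypothetical interfaces illustrating the inference flow around the vault.
    interface PrivacyVault {
        String deidentify(String prompt);                    // swap sensitive values for tokens
        String reidentify(String response, String userRole); // restore only values the caller may see
    }

    interface LlmClient {
        String complete(String prompt);
    }

    class VaultedInference {
        private final PrivacyVault vault;
        private final LlmClient llm;

        VaultedInference(PrivacyVault vault, LlmClient llm) {
            this.vault = vault;
            this.llm = llm;
        }

        String ask(String prompt, String userRole) {
            String safePrompt = vault.deidentify(prompt);   // the LLM only ever sees tokens
            String rawAnswer  = llm.complete(safePrompt);
            // Re-identification is governed by the vault's access control policies,
            // so different roles can receive different views of the same answer.
            return vault.reidentify(rawAnswer, userRole);
        }
    }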
From a privacy and compliance standpoint, using a data privacy vault means that no sensitive data is ever shared with an LLM, so it remains outside of the scope of compliance. Data residency, DSARs, and delete requests are now the responsibility of a data privacy vault that’s designed to handle these requirements and workflows.
Incorporating the vault into the model training and inference pipelines allows you to combine the best of modern sensitive data management with any LLM stack, private, public, or proprietary.
As every company gradually morphs into an AI company, it’s critically important to face data privacy challenges head-on. Without a concrete solution to data privacy requirements, businesses risk remaining stuck indefinitely in the “demo” or “proof-of-concept” phase. The fusion of data privacy vaults and generative AI offers a promising path forward, freeing businesses to harness the power of AI without compromising on privacy.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
The post Privacy in the Age of Generative AI appeared first on Software Engineering Daily.
Machine learning model research requires running expensive, long-running experiments where even a slight mis-calibration can cost millions of dollars in underutilized compute resources. Once trained, model deployment, production monitoring, and observability requirements all present unique operational challenges.
Chris Van Pelt is the Chief Information Officer of Weights and Biases, which is the industry standard in experiment monitoring and visualization, and has expanded that expertise into a comprehensive suite of ML Ops tooling including model management, deployment, and monitoring.
Chris joins us today to discuss the state of the machine learning ecosystem at large, as well as some of their more recent work around production LLM tracing and monitoring.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to see the transcript of this episode.
Sponsorship inquiries: [email protected]
The post Weights & Biases with Chris Van Pelt appeared first on Software Engineering Daily.