At Presence, we prioritize client data security while aiming to maximize productivity and efficiency. Handling sensitive data and executing high-stakes projects for our national brand clients, we understand the absolute necessity of reliable, secure, and compliant solutions. Our recent internal presentation at Global Tech Week, hosted by our parent company, Work & Co., covered our approach to tools like OpenAI's ChatGPT and GitHub Copilot, bringing into focus the need for tools that balance data security with innovative capabilities.
Why Prioritize Internal AI Tooling?
While acknowledging the merits of AI tools like ChatGPT and GitHub Copilot, concerns around potential client data exposure have driven us towards in-house alternatives. We believe self-hosted tools within our secure infrastructure can offer similar benefits while ensuring data privacy. As this space continues to change quickly, we aim to innovate internally so we can apply our learnings to the work we do for our clients.
Presence's AI Toolkit
We established our own internal deployment of a conversational large language model, functioning much like ChatGPT. Because the model runs entirely on Presence's infrastructure, data retention concerns are eliminated. The deployment includes a UI-based tool for direct interaction with the model, as well as an API interface for experimental code and scripting work.
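As a sketch of what working against the API interface looks like, the snippet below constructs a chat-completion request. The endpoint URL, model name, and OpenAI-style payload shape are illustrative assumptions, not our actual internal configuration.

```python
import json

# Hypothetical internal endpoint; real deployments would use their own
# URL and authentication. The payload mirrors the common OpenAI-style
# chat-completions shape, which many self-hosted servers also accept.
CHAT_API_URL = "https://llm.internal.example/v1/chat/completions"

def build_chat_request(prompt: str, temperature: float = 0.2) -> dict:
    """Construct a chat-completion payload for the internal model."""
    return {
        "model": "presence-chat",  # assumed model name
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

payload = build_chat_request("Draft a summary of this sprint's changes.")
# In practice this payload would be POSTed to CHAT_API_URL with auth headers.
print(json.dumps(payload, indent=2))
```

Keeping the payload OpenAI-compatible makes it easy to swap existing client code over to the internal endpoint.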
Additionally, we introduced two code completion models, StarCoder and SantaCoder. Both expose user interfaces and API endpoints and offer IDE-integrated code completion similar to GitHub Copilot. StarCoder, trained on 80+ programming languages, offers an 8,192-token context window, far larger than most open-source LLMs. SantaCoder, a smaller model trained on Python, Java, and JavaScript, offers quicker inference.
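IDE-style completion with these models is typically driven by fill-in-the-middle (FIM) prompting: the code before and after the cursor is packed into a single prompt using sentinel tokens, and the model generates the missing middle. A minimal sketch, assuming StarCoder's published FIM sentinel tokens:

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Pack the code around the cursor into a StarCoder-style FIM prompt.

    The model is expected to generate the "middle" text that belongs
    between the prefix and the suffix.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Example: the cursor sits after "return " in an unfinished function.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(2, 3))\n",
)
print(prompt)
```

The prompt string would then be sent to the completion endpoint, and the generated middle spliced back into the editor buffer.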
Quantization: A Key Player
Quantization plays a significant role in fitting large models onto limited GPU memory. It reduces the bit width of the numbers representing model parameters, for example from 16-bit floating point to 8-bit integers, cutting memory requirements while largely preserving model quality. This technique is what allowed us to deploy models like StarCoder and SantaCoder on our own instances.
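The core idea can be sketched with simple symmetric int8 quantization. This is a toy illustration of the principle, not the exact scheme used for the deployed models:

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float32 weights to int8 plus a single scale factor."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
ratio = w.nbytes // q.nbytes
max_err = float(np.abs(dequantize(q, scale) - w).max())

print(f"memory reduction: {ratio}x")   # 4x: float32 -> int8
print(f"max round-trip error: {max_err:.4f}")
```

Going from 32-bit floats to 8-bit integers gives a 4x memory reduction at the cost of a small, bounded rounding error per weight; production schemes refine this with per-channel scales and outlier handling.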
Conclusion: Innovation Via Secure Internal Tooling
Our strategic choice to develop in-house solutions springs from a balanced vision: the drive for tech-enhanced efficiency paired with an uncompromising stance on data security. We value the capabilities AI tools bring to our operations, and by hosting them internally, we uphold our commitment to protecting client data. The result is an AI infrastructure that captures the best of both worlds: groundbreaking innovation and data privacy. This approach allows us to reap the benefits of AI while keeping security embedded in our technological DNA. Our ongoing projects are a testament to its potential, and as we advance, we continue to welcome discussions and questions about these developments, underpinning our commitment to transparency and progress.