Conf42 Open Source Showcase 2020 - Online

- premiere 5PM GMT

CryptPad: The Encrypted Collaboration Suite

Abstract

As users and organizations, we rely more and more on technology for everything we do, whether for personal or business use. How much do we need to trust our technology providers to handle our data with care?

Is there any way we can be more in control of our data?

The CryptPad project is being built on a new approach, end-to-end encryption, raising the bar for data security and privacy while providing high usability and many collaboration features, including real-time collaboration.

Summary

  • Ludovic Dubost is the CEO of XWiki SAS, the company behind XWiki, an enterprise wiki. CryptPad is an encrypted collaboration suite. Dubost says the software is guided by privacy and security principles. Encrypting real-time collaboration is the first challenge of this type of technology.
  • CryptPad's authentication is based on cryptography. All documents are encrypted, including the data, the titles, and the metadata tags. The safest and most secure way to share a pad in CryptPad is to have the other person in your contacts.
  • CryptPad secures each document with its own key, and every patch is encrypted with the key of the document. Algorithms handle concurrent changes without the server being involved. More advanced applications could be built on top of the encrypted storage.
  • Most of the roadmap is built on the funding the project was able to get. So far it has received funding from NLnet and NGI Trust. Since the COVID crisis, usage has reached about 50,000 users per week and 350,000 pads opened in a week. The team needs to continue to sustain this growth in usage.
  • cryptpad.fr is a paid service, and the team considers it important to have built it as a paid service right away. As users have increased, so have subscribers, who now number 170. Revenue is still about ten times away from completely funding the team.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello, this is Ludovic Dubost. I'm very happy to participate in Conf42 to present CryptPad, the encrypted collaboration suite. So who am I? I'm Ludovic Dubost, CEO of XWiki SAS. I'm the creator of XWiki, which is an enterprise wiki. XWiki SAS is a company based in France and Romania, which has been doing open source for 15 years. We're 40 employees at XWiki, and we could state our motto as: make a living and contribute. XWiki SAS launched CryptPad as a new tool four years ago. So why does CryptPad actually exist? First, when it comes to privacy, what we see now on many websites is "we value your privacy". This is what most websites are telling us. Unfortunately, when you go a bit deeper and look at the privacy policies and everything that is happening with our data, what they actually mean is that they use and sell your data for many things: for advertisement, to sell our contacts, or to understand exactly what we're doing so that they can sell us more services. If we look a bit deeper at big tech, so Google, Facebook, Apple, Twitter, Amazon, Microsoft, what do they actually know about us? This is actually a very long list. They know everything about our identity, but they also know a lot about us, our friends, the data we share on their systems, and everything we do, including some things that we don't do on their own websites. Facebook, for example, knows even what we're doing outside of Facebook. And this list goes on and on. I took this list from Security Baron, which compiled it and compares what the big providers know about us. That's the first problem. The second problem is more general: what about security? What we see is that we're using the Internet more and more, including for our work. 
We used to have most of our work on our personal computers or inside company networks, and now we have everything on the Internet, in the cloud. And most of this data is unencrypted everywhere. The transparency about what's happening with this data and how it's being protected or handled is actually quite low. There is a lot of talk, but if we actually want to know what's happening, it's almost impossible. Of course, we could also go back to some sort of self-hosting or use providers that are more ethical. The difficulty is that it's not easy for small actors, or even yourself, to secure your own data. If you're running your own server, it requires quite some experience to actually secure it. So why did we build CryptPad? The question we asked ourselves was: what can we do to actually enforce users' privacy and security using encryption? Is it possible to build collaboration software that takes this as a key principle? This is how we ended up building CryptPad as an alternative to existing collaboration tools. XWiki SAS is a company rooted in collaboration; helping people collaborate is the type of thing we do at XWiki. So we decided to build a collaboration tool that is guided by privacy and security principles. The corollary is that there is no business model based on users' data in what we do. So what are the key principles of CryptPad? We start by creating encrypted shared documents that can be edited in real time. This comes from the history of CryptPad. We were interested in real-time collaboration for the XWiki software: how can you edit documents at the same time? And we realized that we were able to encrypt that real-time collaboration and use the server as just storage for encrypted data, while the handling of the real-time collaboration happened entirely in the browser. 
So we decided to make software that would work only with encrypted data. The second key principle is the management of keys. Encryption combined with real-time collaboration is the first challenge of this type of technology; the second is how you actually handle the keys, secure them, and share them with other users. We handle these keys in personal, shared, or team drives. Every user has a personal drive, which is itself an encrypted document, protected by a key created from your username and password. The personal drive is stored on the server and contains, encrypted, all the references to the other pads or documents that you have, along with the keys associated with them. We also have shared and team drives. What's interesting there is that a shared or team drive is accessible by multiple users, so whatever you store in that drive, pads and keys, will be accessible to the other users. Once you have a shared or team drive between users, you're not only sharing the documents, you're also sharing the keys of these documents. The third key principle is that we have a system to exchange keys using personal messaging boxes, and this uses public/private key cryptography. If you need to send a document to another user, once you've been in contact with that user, you have their public key, so you can send them a message, and they will receive the message in their CryptPad and be able to decrypt it. This message contains the address of the document, the pad, and the key associated with it. So what do we actually know about you when you're using CryptPad? There are a few categories, and we have a document on this subject on our blog. First, there are things we cannot avoid seeing but don't actually collect: IPs and public keys. 
We do see the IPs of our users and what documents they request, and we also see the public key associated with this IP. We're not storing that connection, but ultimately, if we wanted to do it as a hoster of CryptPad, this is what we could do. Then there are other things that we store because we actually need them. We store the encrypted files, and these encrypted files are linked to the public keys of the users. We also have information on the user's identity when they're paying users, because this is required by law. And we also have statistical information. This is not information that we would necessarily have to store. It is information that we do see, such as the IPs and the actions that people perform on our servers, and also information that users have allowed us to access. For example, we have a telemetry setting where you can accept or refuse to send us the actions that you're doing in the tool. No content is shared at that point, only the actions and this statistical information, including locations. We're very interested in that information because it allows us to understand how CryptPad is progressing, and how it is progressing in different countries. We could record less of that information in the future, but it's quite useful for making CryptPad a success at this point. And of course, if you host CryptPad yourself, then the hoster of that instance becomes the one who has access to this information and decides what to store or not. It could be possible to have even less access to this information, but it's quite difficult; it's much more difficult to not have access to users' IPs, for example. So these are things that can be continuously improved in the future. Now, what can we not know? The first thing is that we don't know your password. We don't even know your username. 
This is actually quite interesting. On most websites, one of the biggest problems is that you're sending your password to the website for verification, while in CryptPad we're doing an authentication that is based on cryptography, where your username and password never leave your computer. We derive a key, and this key represents data on our server. The second part of what we can't know is the actual content of the documents that you're storing on CryptPad. All these documents are encrypted, including the data, the titles, the metadata tags, all this information. This protects your data considerably more than collaboration software that encrypts the content, like the text that you could find in some data structures, but keeps the structured data around it unencrypted. In CryptPad we've gone to great lengths to have as little information as possible on the content that people collaborate on. So when you're sharing your pad, we don't know the name of the collaborator that you're sharing it with. Now, what do we actually have in CryptPad, and what are the collaboration tools that exist? I'll show a demo of these. We started with a rich text pad, which is a WYSIWYG document that you collaborate on in real time. We have a code pad where you can write Markdown and also include some specific markup, like Mermaid, to draw graphs. We also have a presentation pad, which is again in Markdown but lets you present as slides. We have sheets, which is based on the integration of OnlyOffice inside CryptPad, so it's Excel compatible. You can export and import an Excel file, and it has quite a good set of features similar to Excel. We have a kanban, similar to Trello, where you can organize tasks in kanban form. We have a whiteboard, and we have a poll tool. 
All this is organized in a CryptDrive, so you can organize your content and add attached files, PDFs, or any type of file you want. And we have a team feature which allows you to share such a drive, including a chat, with a group of people that you choose. So let's look a bit at these features. If I go to the CryptDrive here, you can see my drive with folders. I can organize this in folders. This specific drive in my folder list is actually a shared drive; we can recognize it by the little icon. That means that I can share that folder with other users. I can do that using the share button. I can share with people I know by sending them a notification in CryptPad. But I can also generate a link for that access, either in view mode or in edit mode. When you're sharing a URL that gives access to pads, you have to be careful to share it over a secure channel, because anybody who could read that communication channel could access the information. The key point here is that the safest and most secure way to share a pad in CryptPad is to have the other person in your contacts and share the pad using the account that the person has on CryptPad. So you have folders, and we can also attach images; I can drag and drop files into this directory. As you can see, I have some images there, and then I have some pads. Our most used pad is the WYSIWYG pad, so let me take an example here. This is a sample document that we use to show the common features. This is our WYSIWYG editor. You can type content here, and if two users open that pad, for example if I open that pad in a second window, I can make changes here and these changes will show up on the other screen. This WYSIWYG editor has quite a lot of features. You can insert images; here you can see an image, and I can resize the image. 
I can also use bullet points and put equations in the document. I can chat around the pad, and I can also comment on the pad. If I select some content here, I can add a comment. This is actually collaborative, because I can mention other users and send them a message telling them that I'd like them to do something in this pad. So there are quite a lot of features. We have a history feature, so you can roll back in case of difficulty. You can export and import, and you can print and, for example, extract your pad as a PDF this way. Another type of pad that we have is the code pad. You're typing in Markdown and you're seeing the rendered content directly. What's interesting there is that you also have author colors, so if you have multiple users in your pad, you can see who has typed what. This is similar to a feature that exists in Etherpad, where you can see who has added content in a pad. This pad is showing Mermaid syntax, which lets you draw graphs, including Gantt charts and things like that. We also have syntax coloring for other languages and for Markdown, so you can see the colors of the syntax and understand what you're typing. Another type of pad is the whiteboard. This is quite useful for education. You can work in real time on a drawing and show it to somebody else during a video conference, for example. Another type of pad is spreadsheets. This has been quite challenging and is actually a breakthrough for us: we're integrating the OnlyOffice open source software, which is built in JavaScript, into CryptPad. Everything that happens in the document is encrypted and stored encrypted on the server, even including images. It's possible to add images to your spreadsheet, and they will be stored in the CryptDrive and embedded in your spreadsheet documents. 
We also have an import/export feature. If I do export here, I can export to Excel. This uses WebAssembly, because the code in OnlyOffice that converts the document from the internal OnlyOffice format to the Excel format was C code. Another aspect is that, being a spreadsheet, it supports graphs, so you can see a chart based on the spreadsheet data. Let me see if I have some other examples to show here. You can see we have a poll and we have a kanban. I will show these by going back to the presentation. Here we have the homepage of CryptPad, and you can see that we have extended our storage limit on cryptpad.fr, which is the hosted CryptPad that we provide to the community and which also has a paying subscription. We extended the free plan to 1 GB during the COVID outbreak, as many more users needed online tools. This is the drive again, and this is the sharing mechanism. I showed the code pad. Then the kanban: this is actually our team's kanban, which shows the tasks the team is working on. You can see that it's quite an extensive roadmap, with lots of work there; I'll detail a bit later what is planned for the future. Then the spreadsheets and the whiteboard. So now, after this demo of the live tool, I'd like to show some of the technological aspects of CryptPad. The first item is user authentication. When you log in to CryptPad, you have a username and password, and this username and password never leave the user's computer. We're using scrypt to derive a key, and this key represents data on our server. This data is unique. This allows us to bootstrap your storage space on the CryptPad server, which will then contain the keys of the different documents that you're sharing.
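The login step just described can be illustrated with a minimal Python sketch. This is not CryptPad's actual code: the use of the username as the salt, the cost parameters, and the split of the derived bytes into a `drive_locator` and a `drive_key` are all assumptions made for illustration. The only point it shows is that the same credentials always yield the same key material locally, so nothing secret has to be sent to the server.

```python
import hashlib


def derive_login_keys(username: str, password: str) -> dict[str, bytes]:
    """Derive deterministic key material from credentials, entirely client-side."""
    # Assumption: the username acts as the salt, so the same credentials
    # produce the same bytes on any device without asking the server.
    seed = hashlib.scrypt(
        password.encode("utf-8"),
        salt=username.encode("utf-8"),
        n=2**14,  # CPU/memory cost, illustrative value only
        r=8,
        p=1,
        dklen=64,
    )
    # Hypothetical split: one half identifies the encrypted drive on the
    # server, the other half decrypts it and never leaves the client.
    return {"drive_locator": seed[:32], "drive_key": seed[32:]}


keys = derive_login_keys("alice", "correct horse battery staple")
print("drive locator:", keys["drive_locator"].hex())
```

Because the derivation is one-way and runs in the client, a server that only ever sees the locator cannot recover the password, which also explains why lost credentials cannot be retrieved by the hoster.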
It's important to understand that if you lose your password, or even your username, we, the hosters of the CryptPad instance at cryptpad.fr, are unable to retrieve it. So it's really your job to secure them properly and make sure you don't lose them. Now, when it comes to the documents, every time a document changes in your browser, we create a patch, and this patch is sent to the server encrypted using the key of the document. We have the ChainPad algorithm, and also some other algorithms, which are used to handle concurrent changes without the server being involved. This merging algorithm makes sure that everybody gets to the same result, even if there are concurrent patches that would be incompatible. So there is no 100% guarantee that your change is going to make it through if you're in a collaboration session. But what is 100% guaranteed is that everybody will arrive at the same result in the end. There is only one possible outcome of the collaboration. For example, if one user makes a change to a paragraph and another user deletes it, the CryptPad algorithm will choose a path that will be the same on all clients. Either the change will be considered to have been made before the deletion, or the deletion will be considered to have been made before the change, and in the end everyone gets the same result. Another aspect is that we're storing all documents in CryptPad, including your drive, as a history of patches. In order to avoid going back to the beginning of every document to reconstruct the current state, we store a full version every 50 patches. So if you're working on a document, the general principle is that we're sending patches.
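The patch history with periodic full versions can be sketched as follows. This is a simplified Python illustration, not CryptPad's implementation: the `Patch`, `Checkpoint`, and `History` names and the patch format are hypothetical, and encryption of each entry is left out for brevity. Only the interval of 50 patches comes from the talk.

```python
from dataclasses import dataclass, field

CHECKPOINT_INTERVAL = 50  # from the talk: a full version every 50 patches


@dataclass
class Patch:
    offset: int  # position where the edit starts
    remove: int  # number of characters deleted at that position
    insert: str  # text inserted at that position


@dataclass
class Checkpoint:
    content: str  # full document state at this point in the history


def apply_patch(doc: str, patch: Patch) -> str:
    return doc[:patch.offset] + patch.insert + doc[patch.offset + patch.remove:]


@dataclass
class History:
    """Append-only log of patches with a checkpoint every N patches."""
    log: list = field(default_factory=list)
    _state: str = ""
    _since_checkpoint: int = 0

    def append(self, patch: Patch) -> None:
        self.log.append(patch)
        self._state = apply_patch(self._state, patch)
        self._since_checkpoint += 1
        if self._since_checkpoint == CHECKPOINT_INTERVAL:
            self.log.append(Checkpoint(self._state))
            self._since_checkpoint = 0


def load(log: list) -> tuple[str, int]:
    """Rebuild the document from the latest checkpoint plus later patches."""
    doc, start = "", 0
    for i in range(len(log) - 1, -1, -1):
        if isinstance(log[i], Checkpoint):
            doc, start = log[i].content, i + 1
            break
    replayed = 0
    for entry in log[start:]:
        doc = apply_patch(doc, entry)
        replayed += 1
    return doc, replayed


history = History()
for i in range(120):
    history.append(Patch(offset=i, remove=0, insert="x"))

document, replayed = load(history.log)
print(len(document), "characters rebuilt by replaying", replayed, "patches")
```

With 120 single-character patches, checkpoints are written after patches 50 and 100, so a fresh load only replays the 20 patches that follow the last checkpoint instead of all 120.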
If you're reloading a document from scratch, we look for the latest checkpoint and then the patches that follow it, and we reconstruct the state of your document as of the last patch. All document encryption keys are stored in your drive, and the drive is a CryptPad document itself, which is protected using the same mechanism. What is particular about the technology of CryptPad is that all the editors are fully written in JavaScript, and we have no server component for any of them. Everything runs in your browser, and needs to run in your browser, so that we can secure the collaboration. Now the question is, how far can this go? What can we do with this technology beyond what it's doing today? I've shown in the live demo that we're already doing quite a lot of things. Well, we can go much further in terms of integrating any editors that are built in JavaScript. We've already done some prototypes in the past to integrate draw.io, for example, and we also have prototypes to integrate the other components of OnlyOffice, for presentations and documents, so PowerPoint compatible and Word compatible. These are actually quite close to working, but we didn't want to launch them, because the more pads we have, the more support we need to provide to make sure that they all work very well. What we want first is for spreadsheets to work very well and to scale to many users, and then we'll potentially deploy more editors. There could also be contributors who build and support editors that run on the same platform. There is also the possibility of building more advanced applications on top of the encrypted storage. We have focused on one advanced application, which is managing your drive, because this is something we absolutely needed to secure the collaboration on the pads. 
But you could also imagine calendars, blogs, wikis, databases, or surveys built on this encrypted storage, and we plan to work on some of them in the future. This approach could be used for any type of application. But you need to think differently when you build them, because you need to build them with the constraint that everything will happen on the client side and nothing will be done by the server. That is not really the way most applications have been built in recent years, where providers are trying to build applications that they control and on which users are dependent. Another type of thing we could do, and we've done some experiments on this, is encrypted audio and video conferencing. We find it interesting to add at least audio conferencing around the document: when you're working collaboratively on a document, it's useful to be able to talk to each other. The CryptPad system can also transfer audio and video data, and the browser could implement playing it. So we can go very far in terms of the new things we can build. Now, the thing is, there is already a lot of work needed to bring all editors and applications on par with non-encrypted applications. If we look at what we do in CryptPad, we have a lot of applications, like WYSIWYG editing, Markdown editing, and OnlyOffice-compatible editing. We also have a kanban, which is potentially a competitor to Trello. All these applications compete with tools that have a very large range of features, and some of those features are not so easy to build. So we also need to choose between the number of editors and the quantity of features in each editor. Another aspect is that users are also very interested in mobile and offline access to these documents. 
This is a bit like the difference between Google Drive or Google Docs and Dropbox. People are also interested in replicating their data on their computer and editing it, potentially offline. So there is also a significant amount of work in writing mobile and desktop clients for CryptPad. There is also advanced search, which will be an interesting challenge to build. Another aspect is that it's possible to make CryptPad a decentralized service. This is also an axis of work, where CryptPad instances would collaborate, and a user on one CryptPad instance could work collaboratively on a document with a user from another instance. Now, what is actually our roadmap? Most of our roadmap today is built on the funding we were able to get for the project; I'll say a bit more about that funding later. So far we have received funding from NLnet and NGI Trust. We're very grateful for that funding, which allows us to fund a very active roadmap. Most of the development that we're doing is based on the roadmaps that we proposed to these funds. Right now we're finishing up the Communities project funded by NLnet, which has already funded the teams feature. To finish the project, we're completing the implementation of document review; the comments that I've shown in the WYSIWYG pad were funded by this project. We have also improved the administration panel, and we're improving the documentation for users and instance administrators. The second project we will be working on is called SMC, secure mobile communication. The objective there is to develop a prototype Android application. This should also open up a lot of things, because underlying it there is work on making CryptPad more modular. The third project is the Dialogue project, which we plan to do in the second part of this year. 
The objective will be to improve the current poll application and implement a form application. What we implement in the Dialogue project will also open new possibilities in terms of applications built on top of the encrypted storage, so we'll create some APIs that can be used by even more complex applications. In parallel, we're always working on maintenance and performance. This has been super active in the last six months, as I will show with the usage of CryptPad, which has grown a lot, particularly because of the COVID crisis. We have done a lot of work on performance, and we need to continue to sustain the growth of usage, in particular of the cryptpad.fr instance, so that we can support more usage. If we look at CryptPad usage, we have about 450 installs in the world, including cryptpad.fr, which is our main instance and is managed by our team. This instance saw tremendous growth when the COVID crisis started, because of two types of users in particular. One is people working from home, who needed more collaboration tools, and the other is the education space and schools, which needed ways to collaborate with students. We've been able to see on cryptpad.fr that a lot of teachers were using CryptPad. Another aspect of CryptPad usage is that CryptPad is heavily used in Germany, which is the first country in terms of usage. This is not where the CryptPad team originates from: the team is from a French company based in Paris. France is also a big country in terms of usage, but the German usage is much higher. Since the COVID crisis, we've reached about 50,000 users per week and 350,000 pads opened in a week, which is four times higher than what we had before. What we have also seen is that the usage of CryptPad has grown a lot in the US. First it grew because of COVID, and then it more than doubled in the last two weeks. 
We believe that this is linked to the protests in the US and to some users recommending the use of encrypted tools. If we look at the usage of CryptPad over three years, we can see the effect of the COVID crisis on the number of pads opened on CryptPad: we grew almost tenfold in a year, but in the year before we were already growing two to three times. Another thing we see is that many of our users are recommending CryptPad to other users, and we're very grateful for that. It really helps spread the usage and show that it's possible to use more privacy-friendly tools for collaboration. The data that I'm showing is based only on cryptpad.fr. There are about 450 other instances, for which we don't have detailed usage data. The CryptPad team is currently three full-time developers, so it's not a very big team that is handling both the development and the main cryptpad.fr instance. The team receives some support from the XWiki SAS team, so from human resources, from marketing, and also from me. We have more than 400 independent instances, and we have a community of users and some administrators who participate through our Matrix channel and who are also helping to promote CryptPad. CryptPad wouldn't be what it is without the promotion by all the users who are making it known. Now, when it comes to CryptPad funding, I believe that it's very important to talk about the funding of open source tools. We're strong believers in open source tools, but it's really hard to get open source tools if we don't manage to fund them properly. CryptPad originated from a French project funded by BPI, a state organization funding R&D. This project funded XWiki and some other companies, and this is how we got CryptPad bootstrapped. 
At the end of March 2019, this funding ended, and we needed to find ways to continue the project. We were happy to apply to the NGI Zero PET fund, through which NLnet has funded multiple projects that are improving CryptPad, and we also got funding from NGI Trust, which has helped us complete the funding for the year. We were also happy to receive a $10k grant from the Mozilla Open Source fund when we applied to it at FOSDEM this year. This has been very valuable, because it has allowed us to fund the team that was working on CryptPad and to continue to fund them now. What is also very important is what kind of long-term funding we're able to build, and we have a few strategies in place for that. The first thing is that cryptpad.fr is a paid service, and for us it was really important to build it as a paid service right away, unlike many cloud services that start out free, try to reach millions of users, and then make users discover the paying scheme. We believed that it was really important to show how we think we can fund the project long term. We need funding to get the software off the ground, but then we need a model where it can be sustained over time and we can build a team. As users have increased, subscribers have increased too, and we've now reached 170 subscribers on the cryptpad.fr instance, which represents €1,000 per month in subscriptions. We also have an Open Collective. CryptPad is one of the popular projects on Open Collective, and we have 150 donors there, which represents €500 per month. All this is estimated at about €20k for 2020. Now, it's important to understand that this is still about ten times away from being able to completely fund the team based only on revenue, without relying on grants to sustain the development. 
We believe that it's possible to continue to increase the usage of the main cryptpad.fr instance and to increase both subscribers and donors. There are also possibilities to package support services for enterprise instances, which could bring in some revenue to help fund the technology. We welcome any help. We also welcome contributors who want to help us continue to improve this product, and administrators who want to host instances, make CryptPad better known, and offer a solution, so that we use fewer privacy-unfriendly tools and more tools that protect our data and our privacy. Thank you very much. I've been really happy to participate in this Conf42 conference, and I hope you appreciated this talk.

Ludovic Dubost

Founder & CEO @ XWiki SAS
