The technology behind CloudFiler

Is it even possible?

CloudFiler started as just an idea. The aim was simple to state, but we were unsure whether it would be possible. We wanted to create an email filing experience similar to that of earlier email management tools we had worked on, but via any device and without the headaches that customers complained about.

Stuck in a rut?

Initial ideas clung to the belief that emails had to be filed to Windows file system folders and that it had to work whilst offline. With that came the need to maintain a local search index. My co-founder Joseph Anderson and I came up with many ideas which would have worked, but they were seriously complex and that was a worry.

Epiphany

Some chance conversations, however, made us reframe the problem. The first realisation was that our assumption that the product needed to work when offline was holding us back. When my first email management product was born back around 2002, few people had a laptop, there was no Wi-Fi in the office, let alone on public transport, and connections were unreliable, so the software had to handle offline working. Today things are very different: whilst there are still times when you can find yourself offline, it’s typically for a few minutes rather than days. This allowed us to rethink the connectivity issues, and hence the caching of filing operations, which significantly simplified the under-the-hood processes.

It gets worse

The second realisation was that people need to file to many more types of storage, not just Windows folders. This made the problem worse, because it would require our software to connect securely to many different systems, such as Autodesk Construction Cloud, Salesforce, SharePoint, and other document management and CRM systems. Filing into these systems could be a challenge, but what’s harder is indexing the content.

The problem is that each device has to query the storage system to find out what’s there, as there’s no point in re-indexing material that you already have, and it then needs to download every new message in order to index it. These systems protect themselves from denial-of-service attacks by limiting the load that any one connection can put on them. In effect, they also inhibit software from indexing the data. Worse still, the interfaces that they provide vary in what they allow you to do. So whereas some are straightforward, others have not anticipated the need for external applications to upload and retrieve files from their systems.
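To make the cost of that per-device approach concrete, here is a minimal Python sketch of the loop described above. It is not CloudFiler code: the function names, the `fetch` callback and the `max_per_second` throttle are all assumptions made for illustration, standing in for whatever a given storage system's interface and rate limit would actually look like.

```python
import time
from typing import Callable, Dict, Iterable, List, Set


def plan_downloads(remote_ids: Iterable[str], indexed_ids: Set[str]) -> List[str]:
    """Return the remote message IDs not yet in the local index, in listing order."""
    return [mid for mid in remote_ids if mid not in indexed_ids]


def index_new_messages(
    remote_ids: Iterable[str],
    indexed_ids: Set[str],
    fetch: Callable[[str], str],          # hypothetical: downloads one message body
    max_per_second: float,                # hypothetical: the storage system's rate limit
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Download and index only unseen messages, pausing between requests."""
    delay = 1.0 / max_per_second
    index: Dict[str, str] = {}
    for mid in plan_downloads(remote_ids, indexed_ids):
        index[mid] = fetch(mid)           # every unseen message must be downloaded in full
        sleep(delay)                      # throttle so the connection stays under the limit
    return index
```

Even in this simplified form, every device repeats the same listing, comparison and throttled download, which is the duplication that a single cloud-side index avoids.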

So not only would the filing of messages to other systems be problematic, the searching would in some cases not be possible at all.

Thinking differently

This led to the realisation that we were thinking too much about the processes and not the outcome: what people want. Put simply, people want to file messages in a structured way so that they can find them quickly, both of which should be possible via any device, and they also want to file messages into other types of storage.

Now you may be thinking that this statement doesn’t change anything, but it enabled us to turn the problem on its head.

So we decided that there needed to be a single source of truth in the cloud. This would be indexed in the cloud too, removing the need for local indexers and all the issues with connectivity and out-of-date content. We then designed the filing interface so that it would work on any device. The last part of the puzzle was syncing selected content to other systems.

It was clear that in some instances we would be able to push files into the target system but in other cases this would not be practical, so we would need to pull the files to the destination from within the target system.

To address this, we designed a connector-based architecture using our serverless REST API, which is available to all customers, allowing either them or third parties to build their own connectors where needed.
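The push and pull cases can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the CloudFiler API: `FiledMessage`, `Connector`, `PushConnector` and `PullConnector` are hypothetical names, and an in-memory dictionary and list stand in for the target system and the REST API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FiledMessage:
    message_id: str
    location: str      # the filing location chosen by the user
    content: bytes


class Connector(ABC):
    """Hypothetical common interface: sync pending messages towards a target."""

    @abstractmethod
    def sync(self, pending: List[FiledMessage]) -> List[str]:
        """Return the IDs of the messages delivered by this call."""


class PushConnector(Connector):
    """For targets that accept uploads: the service writes into them directly."""

    def __init__(self) -> None:
        self.target_store: Dict[str, bytes] = {}   # stands in for the target system

    def sync(self, pending: List[FiledMessage]) -> List[str]:
        for msg in pending:
            self.target_store[f"{msg.location}/{msg.message_id}"] = msg.content
        return [msg.message_id for msg in pending]


class PullConnector(Connector):
    """For targets that cannot be written to from outside: queue the messages
    so that an agent running inside the target system can collect them."""

    def __init__(self) -> None:
        self.outbox: List[FiledMessage] = []

    def sync(self, pending: List[FiledMessage]) -> List[str]:
        self.outbox.extend(pending)
        return []                                  # nothing is delivered until collected

    def collect(self, limit: int) -> List[FiledMessage]:
        """Called from the target side to pull the next batch."""
        batch, self.outbox = self.outbox[:limit], self.outbox[limit:]
        return batch
```

The point of the shared interface is that the filing side does not need to know which direction the transfer runs in; only the connector does.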

Scalability & Security

The next big issue was scalability. If one customer bulk files a million messages and a single user in another business files just one message at the same time, both should feel that the service is as fast as it can possibly be and dedicated to them alone. To achieve this, we needed the system to expand and contract automatically to meet instantaneous demand. So our system adds extra servers when needed and can boot up a server from cold in 0.5 seconds.

Putting the messages in the cloud adds another problem: security. It’s actually three problems. The first is that we need to be sure that individuals can only do or see what their permissions allow. The second is that the data needs to be safe, i.e. it’s there when you need it and can’t be lost. The third is that data should be encrypted both when it is sent over the internet and when it is stored on our service, as you wouldn’t want us to be able to read it.

So we needed to build methods to ensure that we could properly authenticate users, ideally via Active Directory so that they don’t have to provide usernames and passwords, and we also needed to ensure that all gateways are secure, including the API access.

So we built it with PCI-compliant, banking-level security. We encrypt the content both in transit and at rest, and each file is synced to storage on three separate servers. We also maintain back-ups on completely separate infrastructure, just in case.

In addition, we use one-time and time-limited security tokens to further protect the system. For example, when a user opens a search window in a browser, they just pick an icon and the search window opens, so to them it seems simple. Behind the scenes, the system first authenticates the user, and the back-end then creates a temporary one-time link to a secure search window that will only allow them to find the content that their permissions allow. If they copy the URL and paste it into another window, it won’t work. If they leave it open for too long, it will expire and they will have to open a new one.
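As a rough illustration of the one-time, time-limited link idea, here is a minimal Python sketch. It is an assumption-laden model rather than our actual implementation: `OneTimeLinkIssuer`, its methods, and the in-memory token table are all hypothetical, and it covers only the link itself, not the expiry of a search window that is already open.

```python
import secrets
import time
from typing import Callable, Dict, FrozenSet, Iterable, Optional, Tuple


class OneTimeLinkIssuer:
    """Hypothetical issuer of single-use tokens that expire after a fixed time."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # token -> (user, locations the user may search, expiry time)
        self._pending: Dict[str, Tuple[str, FrozenSet[str], float]] = {}

    def issue(self, user_id: str, allowed_locations: Iterable[str]) -> str:
        """Call only after the user has been authenticated."""
        token = secrets.token_urlsafe(32)          # unguessable random token
        expires_at = self._clock() + self._ttl
        self._pending[token] = (user_id, frozenset(allowed_locations), expires_at)
        return token

    def redeem(self, token: str) -> Optional[Tuple[str, FrozenSet[str]]]:
        """Return (user, allowed locations) once; None if reused, expired or unknown."""
        entry = self._pending.pop(token, None)     # pop makes the token single-use
        if entry is None:
            return None
        user_id, allowed, expires_at = entry
        if self._clock() > expires_at:
            return None
        return user_id, allowed
```

Because the token is removed the first time it is redeemed, a copied URL presented a second time is rejected, and because the permitted locations are fixed when the token is issued, the search window can only ever see what that user is allowed to see.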

Measures like these give businesses peace of mind that the system and their data are safe, that the data will be there when they need it, and that the service can scale to meet future needs.

Keeping your insurers happy

This is not the end of the story though. We also knew that some businesses and their insurers are very uncomfortable with the idea of relying on any third-party system to hold their data and be there forever, so we designed the connectors to provide continuous syncing of content to storage of your choosing, either on-premises or in the cloud.

CloudFiler has been designed to meet the needs of modern businesses by supporting all devices that run Outlook, allowing you to store your data wherever it suits you, and providing the programming interfaces that allow you to get more from it as your needs change.