The marketing site says there’s full text search. The paper doesn’t cover this. ...

amilich · on Aug 23, 2023

Done completely client side. We actually have multiple blogs on this - https://skiff.com/blog/private-search

raggi · on Aug 23, 2023

So if I have 15 GB of email, I have to process all of that on every client bootstrap?

How do you plan on scaling your service with respect to this problem? Once you have a non-trivial user base with non-trivial data volumes, this is likely to become a substantial problem.

raggi · on Aug 23, 2023

To put this in context, take the trivial example of a user with a 15 GB account, and say you happen to be using S3 for storage. They buy a new phone: that costs you ~$1.50 that month, or 50% of your revenue at current pricing. They buy a new iPad and a new laptop? You’re 50% in the red.

Similarly, you’ll have some users who are, say, content creators. They shove a 10 GB video in their drive. Let’s say they have a laptop, a workstation, a phone and an iPad. Their upload is downloaded at least 3 times, costing you ~$4.50.

amilich · on Aug 23, 2023

This isn't how it works at all? We don't pay for storage on users' devices... buying the device = buying the storage. It's actually much more efficient than doing search through some massive database.

raggi · on Aug 23, 2023

Based on the blog you referenced up thread:

I upload a large document to your drive product from my workstation. I go to search on my phone. My phone needs to download the content in order to index it. My phone downloads the content from you. You pay for the bandwidth.

If I provision a new device, and it needs a new search index, it needs to download all of my content once, in order to populate the local index of the content.

If I'm something like a YouTube content producer, I might put extremely large files in the drive. Per the blog post, all the other devices signed into drive will see this new file and pull it down to index it.

So if I upload a 15 GB video from my iPhone to later process it on the workstation, my laptop, iPad and workstation will all download it. That means you need to serve up 45 GB of bandwidth. Cost of operation as described in the post above.
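The arithmetic behind that scenario can be sketched roughly as follows. Both the per-GB egress rate and the monthly revenue are assumptions back-derived from the "~$1.50" and "50% of your revenue" figures quoted earlier in the thread, not numbers from any price list, and `egress_cost` is a hypothetical helper name.

```python
# Assumed figures, inferred from numbers quoted in the thread (not from a price list).
EGRESS_USD_PER_GB = 0.10    # "~$1.50" for a 15 GB account implies roughly this rate
MONTHLY_REVENUE_USD = 3.00  # "$1.50 ... or 50% of your revenue" implies roughly this


def egress_cost(file_gb, downloading_devices, usd_per_gb=EGRESS_USD_PER_GB):
    """Bandwidth served and its cost when each other device pulls the file once."""
    total_gb = file_gb * downloading_devices
    return total_gb, total_gb * usd_per_gb


# A 15 GB upload from one device, pulled down by three other devices for indexing.
total_gb, cost = egress_cost(file_gb=15, downloading_devices=3)
print(f"{total_gb} GB served, about ${cost:.2f}")
print(f"share of assumed monthly revenue: {cost / MONTHLY_REVENUE_USD:.0%}")
```

Under these assumed rates, a single large upload fanned out to three devices costs more than the account brings in that month, which is the concern being raised.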

onereplyac · on Aug 23, 2023

Depends. On an older phone, downloading all emails just to allow local searches won't be very efficient. Logging out also becomes a problem: if emails are stored on a device that gets stolen, the adversary now has access to the local index, since all the keys are on the device, usually with no FDE. Meanwhile, with Gmail a log-out would clear all traces instantly.

amilich · on Aug 23, 2023

Also, not really true of Gmail. Try turning your WiFi off, then deleting your Gmail account. You might have mail stored offline on your phone (let alone any other device), as well as any IMAP or other clients. It's the same or worse.

amilich · on Aug 23, 2023

Emails are downloaded when you receive them. Isn't that how email works?

onereplyac · on Aug 23, 2023

Normal email providers don't download all emails whenever a user logs into a new device.

amilich · on Aug 23, 2023

We also don't do this. In a near-future implementation you can just synchronize the end-to-end encrypted search index.

raggi · on Aug 24, 2023

This step is what I was expecting you to talk about, and it has some tricky subtleties to get right, which is why I looked for it in the whitepaper.

A trivial problem with a naive implementation is being able to perform presence proofs using side-channel information: send someone mail containing a term you want to verify, and watch for the associated high-level costs affecting operations that are likely to be incremental index change uploads.
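One way such a leak could arise is sketched below. This is a hypothetical naive design invented for illustration, not the scheme from the blog or whitepaper: each term's posting list is stored as its own blob, and any blob touched by a new message is re-serialized and uploaded in full. Since common ciphers such as AES-GCM add only a fixed overhead to the plaintext length, an observer of upload sizes would see roughly the sizes computed here.

```python
import json


def add_message(index, msg_id, text):
    """Index a message; return the bytes a naive client would re-upload.

    Hypothetical design: one blob per term, re-sent whole whenever it changes.
    """
    uploaded = 0
    for term in set(text.lower().split()):
        postings = index.setdefault(term, [])
        postings.append(msg_id)
        uploaded += len(json.dumps({term: postings}).encode())
    return uploaded


# Mailbox A already holds 50 messages mentioning "merger"; mailbox B holds none.
index_a, index_b = {}, {}
for i in range(50):
    add_message(index_a, i, "merger update")
    add_message(index_b, i, "weekly update")

probe = "merger"  # the term the sender wants to confirm
size_a = add_message(index_a, 1000, probe)
size_b = add_message(index_b, 1000, probe)
print(size_a, size_b)  # size_a is larger: the existing posting list is re-sent
```

In this toy design the upload triggered by the probe message is larger for the mailbox that already contains the term, so the size alone confirms its presence.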

onereplyac2 · on Aug 23, 2023

You mean you currently do this but plan not to in the future?

pseudalopex · on Aug 23, 2023

All common operating systems can encrypt keys or full disks.

mlhpdx · on Aug 23, 2023

Yeah, this is where it gets real. VC funding and unbounded potential operational costs. Ouch.

TillE · on Aug 23, 2023

Seems trivial enough to do client-side on modern hardware, especially if you exclude attachments.

throwaway1566 · on Aug 23, 2023

To do this you would have to download all emails on all devices all the time to index them. That kind of makes the whole point of cloud-based email moot: if everything is on my device anyway, and logging in and out resets it all, I might as well use an email client app.

raggi · on Aug 23, 2023

Without producing confirmation side channels and so on? Doesn’t seem that trivial. At the least, it needs more thought than an assumption of simplicity.

amilich · on Aug 23, 2023

We do it in a fairly straightforward manner right now. Check out the blog I linked to above - generate an in-memory index, end-to-end encrypt it, and store it in browser storage. It's only decrypted in memory.
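A minimal sketch of the "generate an in-memory index" step, assuming a plain inverted index over message bodies. The function names, tokenizer, and sample messages are illustrative and not taken from the blog. The encryption step is deliberately left out: in a browser it would presumably use something like WebCrypto AES-GCM on the serialized bytes before writing them to storage.

```python
import json
import re
from collections import defaultdict


def build_index(messages):
    """Map each lowercase word to the sorted ids of messages containing it."""
    index = defaultdict(set)
    for msg_id, body in messages.items():
        for word in re.findall(r"[a-z0-9]+", body.lower()):
            index[word].add(msg_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return ids of messages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


messages = {
    1: "Quarterly invoice attached",
    2: "Lunch on Friday?",
    3: "Invoice for Friday delivery",
}
index = build_index(messages)

# These bytes are what would be encrypted before being written to storage.
serialized = json.dumps(index).encode()

print(search(index, "invoice"))
print(search(index, "invoice friday"))
```

The index only ever exists in plaintext as the in-memory `index` dictionary; anything persisted would be the encrypted form of `serialized`.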