Multi-tenancy and security

Are there any plans to include support for multi-tenancy and built-in security, or is it expected that other parts of a system will be responsible for such scenarios?

Both are on our list.

I have particular interest in multi tenancy. Could you discuss a bit your requirements?

My initial thoughts are something like:

  • Streams and projections are available only to their assigned user.

  • Metastreams per user, i.e. the $all stream is sensitive to the connecting user and includes events only from streams owned by that user.

Summarized, I suppose you could compare it to having “a database” per user in a SQL setup.

I might, however, be missing something important in this naive summary.
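The per-user visibility described above could be sketched roughly as follows. This is a toy in-memory model, not the Event Store API; the `TenantEventStore` class and its methods are hypothetical names chosen for illustration:

```python
# Toy sketch of per-user stream isolation: every stream has an owner,
# and a user's view of $all contains only events from streams they own.
from collections import defaultdict

class TenantEventStore:
    def __init__(self):
        self.streams = defaultdict(list)   # stream name -> list of events
        self.owners = {}                   # stream name -> owning user

    def append(self, user, stream, event):
        # First writer becomes the owner; others are rejected.
        owner = self.owners.setdefault(stream, user)
        if owner != user:
            raise PermissionError(f"{stream} belongs to {owner}")
        self.streams[stream].append(event)

    def read_all(self, user):
        # $all as seen by `user`: only events from streams that user owns.
        return [e for s, events in self.streams.items()
                if self.owners[s] == user
                for e in events]

store = TenantEventStore()
store.append("alice", "orders-1", {"type": "OrderPlaced"})
store.append("bob", "orders-2", {"type": "OrderShipped"})
assert store.read_all("alice") == [{"type": "OrderPlaced"}]
```

Here the tenant boundary is enforced at read and write time inside one shared store, which is exactly the model being contrasted later in the thread with physically separate instances per tenant.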

-Vidar

What about file-level isolation? Do you back each of them up separately? What about when running clustered with consistent hashing?

I would vote for physical container separation. Required by auditors (at least in the Netherlands).

On 11 Oct 2012, at 16:53, Greg Young [email protected] wrote:

This can be done now with a vnode process per tenant. Doing huge numbers, however, would not be fun :)

I discussed this with my sysadmin, and we concluded that having separation all the way down to the file level would be preferable.

As to the clustering, I’m afraid I can’t provide any feedback on that as I’ve got absolutely no experience with such scenarios.

Would you mind elaborating on the consequences of running a separate vnode process (assuming this means spinning up a separate instance of the server for each user) compared to an integrated multi-tenancy setup?

I assume memory usage would be a concern?

-Vidar

This all seems quite reasonable with a relatively small number of tenants and could be done today. Memory isn't even that much of an issue: a tenant at minimum levels will only use about 200-300 MB. Since you would be placing nodes manually anyway, you could even partition across multiple node groups today. The key here is that it's manual.

The place I am very interested in getting to, however, is a slightly different scenario. Imagine the ES is partitioned across 50 nodes and you want to have 40 tenants. This is relatively easy if we say there is one conceptual data set, partitioned a level above (e.g. one data space sharded across 50 servers; think of prefixing stream names with the customer name). It's a bit trickier when we say we want 40 separate data spaces, as that becomes a different kind of scaling.
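The first model (one shared data space, tenant as a stream-name prefix) can be sketched in a few lines. The hash function and node count here are illustrative assumptions, not how Event Store actually places data:

```python
# Sketch of one shared data space sharded across N nodes: the tenant is
# just a prefix on the stream name, and placement hashes the full name.
import hashlib

NODES = 50

def node_for(stream_name: str) -> int:
    # Stable hash of the full stream name, mapped onto the node ring.
    digest = hashlib.sha1(stream_name.encode()).hexdigest()
    return int(digest, 16) % NODES

# Tenant-prefixed streams all live in the same 50-node space:
acme_node = node_for("acme-orders-1")
globex_node = node_for("globex-orders-1")
```

Note that with plain modulo hashing like this, changing `NODES` remaps almost every stream; consistent hashing (mentioned later in the thread) exists precisely to avoid that mass reshuffle.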

In the latter case you don't want to shard out one giant dataset; you want us to automatically handle fault tolerance and distribution of 40 separate small datasets that are completely isolated. E.g., if you lose 2 of 3 nodes and are unable to build a quorum, you would want us to automatically reallocate nodes in the group to handle that dataset. This, I think, will require some thinking and work to get going.
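The quorum condition driving that reallocation is simple to state. A minimal sketch, assuming majority quorum and hypothetical bookkeeping of alive nodes per tenant group (none of these names come from Event Store):

```python
# Majority quorum per tenant group: a group of size n can make
# progress only while more than n // 2 of its nodes are alive.
def has_quorum(group_size: int, alive: int) -> bool:
    return alive > group_size // 2

def needs_reallocation(groups: dict) -> list:
    # groups: tenant name -> (group_size, alive_count).
    # Returns the tenants whose groups have lost quorum and need
    # nodes reallocated to them.
    return [name for name, (size, alive) in groups.items()
            if not has_quorum(size, alive)]

groups = {"tenant-a": (3, 3), "tenant-b": (3, 1)}
assert needs_reallocation(groups) == ["tenant-b"]
```

This captures the 2-of-3 failure case from the paragraph above: with only 1 of 3 nodes alive, `tenant-b` cannot form a majority, so the cluster would need to assign it replacement nodes automatically.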

Does that make sense?

Cheers,

Greg

After reading your reply I can see how the models differ.

To give you some numbers on our use case: we're considering about 3 or 4 nodes with 70 tenants, growing by about 10-15 per year.

Our system is not really high volume in either frequency or amount of data, so the multiple nodes are only for redundancy.

-Vidar

This is in the area of feasibility to do manually, but you would also likely want some support in terms of management. Let me brainstorm a bit on whether there is something simple that can be done without the full-blown mechanism, including consistent hashing.

Resurrecting this thread: what’s your current thinking on this?

Same question for my company. Any updates?