Advancing Spark - Implementing Row Level Security in Databricks

7,509 views

Advancing Analytics

A day ago

RLS, or Row Level Security, is another one of those "maturity features" often cited as evidence that lake-based platforms still lag behind the more mature relational data stores. From Databricks Runtime 7.3, however, we now have a solution!
This week, Simon looks into the is_member() function and how we can use it to implement secure, performant row-level filtering within a lake-based data model. This has a huge impact on how successfully we can model warehouses within a lake structure, and is a great thing to see!
More info on those Dynamic View Functions here: docs.databrick...
And as always, don't forget to Like & Subscribe, and stop by our site to see if we can help you on your Data Lakehouse journey - www.advancinganalytics.co.uk
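
As a rough illustration of the is_member() approach described above, the sketch below uses Python to assemble the SQL for a dynamic view. The names sales_secure, sales_raw and country_group are hypothetical, and it assumes the source table carries a column holding the name of the group allowed to see each row. Passing a column to is_member() follows the pattern mentioned in the comments below; treat it as an assumption to verify on your own runtime.

```python
def build_rls_view_sql(view_name: str, source_table: str, group_column: str) -> str:
    """Return a CREATE VIEW statement that keeps a row only when the
    querying user belongs to the group named in `group_column`."""
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT *\n"
        f"FROM {source_table}\n"
        f"WHERE is_member({group_column})"
    )


# Hypothetical object names, for illustration only.
rls_sql = build_rls_view_sql("sales_secure", "sales_raw", "country_group")
print(rls_sql)

# In a Databricks notebook the statement could then be submitted with
# spark.sql(rls_sql); that call is left out so the sketch runs anywhere.
```

Users would then be granted SELECT on the view rather than on the underlying table, so the filter cannot be bypassed.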

Comments: 22
@Obizzy8 3 years ago
Very useful feature for an enterprise lakehouse approach! Thanks for the constant great content 👍
@nickhurt8416 3 years ago
Awesome new capability - thanks for sharing!
@drummerboi4eva 3 years ago
Nice stuff! Super encouraging to see that performance doesn't suffer when using RLS in Databricks.
@gulamsardar7799 3 years ago
Thanks again, Simon. Keeping up to date with Spark has become so simple because of you. Though I am a SQL expert, I am struggling a little with Scala. Can you please point me to some hands-on courses that can help me learn Scala? Thanks in advance!
@Monsalvo888 3 years ago
Really useful, thanks!
@film-masti-777 a year ago
This is good, but very basic. Is there a more advanced use case you could present, please? One that includes CLS, table-level security, RBAC, etc.
@SAMSARAN2108 a year ago
Thanks for sharing the RLS concepts in Databricks in this video. I have a requirement: I will create an Azure AD group (SalesAPAC) for the Sales domain and NAM region combination at the Power BI level, and users in this group should have access only to the Sales-related workspace and its reports, seeing only NAM region data in the reports/visualizations. I need to apply the same logic in Azure Databricks by passing the user ID, domain and region into Databricks, so that a user can see only Sales-related tables, views and other objects, and fetch only NAM data from those tables. As per this video, it seems we need to create a new group for RLS. Is there any way to sync Azure AD, use it inside Azure Databricks and define the RLS logic based on the Azure AD group? Regards, Saravanan.S
@bbrocks5530 7 months ago
Did you find a solution?
@SAMSARAN2108 7 months ago
@bbrocks5530 Actually, our organisation didn't move to Azure Databricks, but we may need this for Snowflake. Thanks for reaching out.
@viveksomvanshi3767 3 years ago
As always, very well summarized. Thanks, Simon. Does this mean that if we get it working at enterprise level, we don't need any sort of OLAP engine, i.e. a SQL database or warehouse? Also, if I can achieve similar objectives through schemas and views in SQL, what are the advantages of RLS, given that organization structures are generally more complex than those shown in RLS demos?
@AdvancingAnalytics 3 years ago
I'd hedge my bets more: "in some circumstances, Databricks can function as your OLAP". Not absolutely every case, though. It depends on user volumes, query frequency, latency requirements and so on, so it's a complex question for a yes/no answer! Complexity is also what RLS vs views comes down to. If I have three company segments, managing that via different views is easy. If I have 100, and they frequently change and evolve, that's a huge headache in code maintenance. It also affects who can make the change: a support team can easily add new groups and add or remove members, but asking support teams to define and maintain views (and apply the relevant security to each object!) is harder. As always, there are loads of different ways to approach it; it's just another tool in our belt to design the right security model for the problem. Simon
@l_combo 3 years ago
Thanks for sharing, this is a great start. How do you see this scaling to n dimensions, e.g. member of a country (shown), member of a business unit, member of a role, etc.? Otherwise, I suspect more traditional security makes more sense in the layer where the data is being analysed/viewed, such as BI.
@AdvancingAnalytics 3 years ago
Yeah, it scales as far as you can create groups to back it up. There's a fair bit of potential given you can now create groups and map users to them through the API, so a little Python utility can do the user mapping for you... but you're right, once you get into decent numbers of different roles, it'll become fairly difficult to look after. That said, the alternative to the is_member() function looks at the current user instead, so you could switch to a full many-to-many user table that does the security, giving you full flexibility inside your model. It's slightly more of a pain to implement, though :)
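
The many-to-many user table idea in the reply above could look roughly like the sketch below, which again uses Python to assemble the SQL. It relies on the current_user() dynamic view function. The mapping table user_country_access, its user_name column and the other object names are hypothetical, and the video itself does not show this variant.

```python
def build_user_mapping_view_sql(view_name: str, source_table: str,
                                mapping_table: str, key_column: str) -> str:
    """Return a CREATE VIEW statement that keeps a row only when the
    mapping table links the querying user to the row's key value."""
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT s.*\n"
        f"FROM {source_table} AS s\n"
        f"WHERE EXISTS (\n"
        f"  SELECT 1\n"
        f"  FROM {mapping_table} AS m\n"
        f"  WHERE m.{key_column} = s.{key_column}\n"
        f"    AND m.user_name = current_user()\n"
        f")"
    )


# Hypothetical names: the mapping table holds one (user_name, country) row
# per user and country the user is allowed to see.
mapping_sql = build_user_mapping_view_sql(
    "sales_secure", "sales_raw", "user_country_access", "country"
)
print(mapping_sql)
```

An EXISTS filter is used here instead of a join so that a user mapped to the same key more than once does not see duplicated rows.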
@zycbrasil2618 3 years ago
Hi Simon. This is data object privileges, right? Does it support column-level security?
@nickhurt8416 3 years ago
Yes, see docs.microsoft.com/en-us/azure/databricks/security/access-control/table-acls/object-privileges#column-level-permissions
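
For context, column-level permissions with dynamic views typically wrap the sensitive column in a CASE expression around is_member(). The sketch below generates such a view definition in Python. The group name auditors, the object names and the 'REDACTED' placeholder are illustrative assumptions, and the placeholder presumes the masked columns are strings.

```python
from typing import List


def build_masked_view_sql(view_name: str, source_table: str,
                          visible_columns: List[str],
                          masked_columns: List[str],
                          privileged_group: str) -> str:
    """Return a CREATE VIEW statement that shows masked columns in clear
    text only to members of `privileged_group`."""
    select_items = list(visible_columns)
    for column in masked_columns:
        # Non-members get a fixed placeholder instead of the real value.
        select_items.append(
            f"CASE WHEN is_member('{privileged_group}') THEN {column} "
            f"ELSE 'REDACTED' END AS {column}"
        )
    select_list = ",\n  ".join(select_items)
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT\n  {select_list}\n"
        f"FROM {source_table}"
    )


# Hypothetical table, columns and group.
masked_sql = build_masked_view_sql(
    "customers_masked", "customers_raw",
    ["customer_id", "country"], ["email"], "auditors"
)
print(masked_sql)
```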
@umarhussain9334 3 years ago
Awesome. How expensive is this compared to an Analysis Services instance with RLS (say, S1)?
@AdvancingAnalytics 3 years ago
Take a blunt example of an S1 AAS instance and a 2-worker Databricks cluster, both with ~25 GB RAM: it's around £1,100 for AAS compared to £1,246 for Databricks. But that assumes they're both turned on all the time. Databricks has much better scaling and, the killer advantage, doesn't need to hold all the data, so it can use this technique over masses of data stored cheaply in the lake. If used right, I'd say Databricks is the much cheaper option.
@umarhussain9334 3 years ago
@AdvancingAnalytics My thoughts exactly. With properly partitioned data, using the schema of your choice, this becomes so much more flexible. Throw in the power of Python and this becomes a good sell. Thanks for the video, very helpful.
@mohdshoaib3296 3 years ago
is_member() is not working on a global temp view. Any guidance? Thanks.
@AdvancingAnalytics 3 years ago
Never tried it on a global temp view; it's worth getting in touch with Databricks to discuss the use case. Worst case... just save it as a persisted view? :)
@BitaRastgar 3 years ago
It is nice, but it is not dynamic. If you create a view/table using is_member(country) and then remove a user from a group, that user will still have the access they had when the view/table was created! It would be nice if, when a user is removed AFTER view/table creation, that user could only see what they are allowed to see right now. By the way, is this part of SQL Analytics as well?
@gran_turing 3 years ago
What you describe as the desired behaviour is exactly how it works: the filtering happens at query time, not at view creation time, based on the user querying the data. This is part of the security model that is used with SQL Analytics as well as Table ACL clusters.
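
The query-time behaviour described in this reply can be pictured with a small pure-Python model. This is a toy stand-in for illustration only, not Databricks code: the group_members dictionary, the rows and the query_view() helper are all made up, and the local is_member() merely mimics the idea of the SQL function.

```python
from typing import Dict, List, Set

# Toy stand-ins for workspace groups and a table.
group_members: Dict[str, Set[str]] = {"UK": {"alice"}, "US": {"alice", "bob"}}
rows: List[dict] = [
    {"country": "UK", "amount": 10},
    {"country": "US", "amount": 20},
]


def is_member(user: str, group: str) -> bool:
    """Check membership against the current state of the groups."""
    return user in group_members.get(group, set())


def query_view(user: str) -> List[dict]:
    """Model of a dynamic view: the predicate runs on every query,
    so it always reflects membership at query time."""
    return [row for row in rows if is_member(user, row["country"])]


before = query_view("alice")
group_members["UK"].discard("alice")  # remove alice after the "view" exists
after = query_view("alice")
print(len(before), len(after))  # prints "2 1"
```

Because the view stores only the predicate, not a list of permitted users, removing alice from the UK group changes what her next query returns without recreating anything.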