As teams worldwide embrace cloud-native architectures (replacing waterfall releases with CI/CD, monolithic applications with multiple layers of microservices, and single databases with data meshes), real-time visibility across these layers and their communication patterns becomes critical for management, troubleshooting and assurance.
While there are many visibility tools on the market, most operate at the application and infrastructure layers and fall short of providing detailed visibility into data endpoints (databases, pipelines, data warehouses, etc.). Many data-endpoint performance and usage metrics remain difficult to track, which makes it challenging to answer commonplace questions such as:
- Which services are responsible for the lion’s share of request execution time?
- How does the number of requests from each accessing service change over time, and can we identify bottlenecks?
- Which users are exhibiting suspicious data read patterns?
Real-time visibility into data endpoints lets teams quickly pinpoint what is slow, what is unexpected and what is broken. In this eWEEK Data Points article, using industry information from Manav Mital, CEO of Cyral, we discuss several examples and benefits of this visibility.
Data Point No. 1: SaaS BI tool performance monitoring
Many BI tools, such as Looker, access the database through a single service user that is shared by the requests coming from all users of the tool. Because every request arrives under that shared identity, when a bad request from one user runs for a long time and affects other workloads on the data endpoint, it is difficult to attribute the request to the individual responsible for executing it.
Enriching data endpoint visibility with granular end-user information makes it possible to monitor long-running requests and to trace them back to specific individuals for prompt alerting and quick remediation.
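To make the idea concrete, here is a minimal sketch of how such attribution could work, assuming (hypothetically) that the BI tool annotates each query with the end user's identity in a SQL comment and that the monitoring layer sees the query text and execution time; the field names and threshold are illustrative, not a specific product's API.

```python
import re
from collections import defaultdict

# Hypothetical annotation: the BI tool embeds "/* user=alice@example.com */"
# in each generated query; real tools may use other comment formats.
USER_TAG = re.compile(r"/\*\s*user=([^\s*]+)\s*\*/")

def slow_queries_by_user(query_log, threshold_ms=30_000):
    """Group queries slower than the threshold by the annotated end user."""
    offenders = defaultdict(list)
    for record in query_log:
        match = USER_TAG.search(record["sql"])
        user = match.group(1) if match else "unknown"
        if record["elapsed_ms"] >= threshold_ms:
            offenders[user].append(record["sql"])
    return dict(offenders)

# Synthetic example: one slow query and one fast one from different end users.
log = [
    {"sql": "/* user=alice@example.com */ SELECT * FROM orders", "elapsed_ms": 95_000},
    {"sql": "/* user=bob@example.com */ SELECT count(*) FROM orders", "elapsed_ms": 1_200},
]
print(slow_queries_by_user(log))
```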
Data Point No. 2: DBaaS credits usage monitoring
Modern DBaaS offerings such as Snowflake and BigQuery charge customers based on utilization, which is directly driven by the cumulative execution time consumed by users of the service. Tracking execution times on a daily or weekly basis for the heaviest users, and drilling down into the reasons for their high usage, makes it easy for account and billing administrators to forecast service costs and take remediation steps to keep costs down.
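A rough sketch of this kind of usage tracking is shown below; it assumes a generic query log exposing a user name, a start timestamp and an elapsed execution time, with field names chosen for illustration rather than taken from any particular DBaaS.

```python
from collections import defaultdict
from datetime import datetime

def daily_usage_by_user(query_log):
    """Sum execution time (in hours) per user per calendar day."""
    usage = defaultdict(float)
    for q in query_log:
        day = datetime.fromisoformat(q["start_time"]).date()
        usage[(q["user"], day)] += q["elapsed_ms"] / 3_600_000
    return usage

def top_usage(usage, n=5):
    """Return the n (user, day) buckets with the highest cumulative hours."""
    return sorted(usage.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Synthetic example: a long-running service user and a light interactive user.
log = [
    {"user": "etl_service", "start_time": "2020-06-01T02:00:00", "elapsed_ms": 5_400_000},
    {"user": "analyst_1", "start_time": "2020-06-01T10:15:00", "elapsed_ms": 600_000},
]
print(top_usage(daily_usage_by_user(log)))
```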
Data Point No. 3: ETL throughput issue diagnosis
ETL (extract, transform, load) job throughput and performance are affected by a number of factors, such as the connection pool size, ingest batch size and commit frequency. Data endpoints usually lack the information needed to reason about a poorly performing ETL job because they do not track these metrics.
Improving data endpoint visibility with granular metrics such as these makes it easy for DevOps and SRE (site reliability engineering) teams to monitor their ETL jobs and diagnose issues due to inadvertent changes that affect their performance.
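As an illustration, the following sketch derives batch size, commit frequency and row throughput from a hypothetical stream of observed write statements; the event fields are assumptions made for the example, not a standard log format.

```python
from statistics import mean

def etl_metrics(events):
    """Derive batch size, commit frequency and throughput from observed statements."""
    inserts = [e for e in events if e["type"] == "insert"]
    commits = [e for e in events if e["type"] == "commit"]
    duration_s = (max(e["ts"] for e in events) - min(e["ts"] for e in events)) or 1
    return {
        "avg_batch_rows": mean(e["rows"] for e in inserts) if inserts else 0,
        "commits_per_min": 60 * len(commits) / duration_s,
        "rows_per_sec": sum(e["rows"] for e in inserts) / duration_s,
        "active_connections": len({e["conn"] for e in events}),
    }

# Synthetic example: two connections, each inserting one batch and committing.
events = [
    {"ts": 0, "conn": 1, "type": "insert", "rows": 500},
    {"ts": 2, "conn": 1, "type": "commit", "rows": 0},
    {"ts": 3, "conn": 2, "type": "insert", "rows": 480},
    {"ts": 5, "conn": 2, "type": "commit", "rows": 0},
]
print(etl_metrics(events))
```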
Data Point No. 4: Trickle data exfiltration detection
Trickle exfiltration uses sophisticated mechanisms to extract data from an endpoint gradually. The attacker's objective is to stay “under the radar” of network and security monitoring tools by adopting a low-and-slow approach to exfiltration.
All trickle exfiltration attacks rely on what are called point requests (fetching a small set of specific rows), range requests (fetching a small range of rows) and offset requests (fetching a small set of rows from a different offset each time).
Tracking metrics such as the request rate and the number of data records read over time, and correlating them with other data activity in that time frame, adds the extra visibility needed to detect when an attacker or an insider is actively stealing sensitive data.
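One way to turn those metrics into an alert is sketched below: a simple heuristic that flags principals whose many small reads add up to a large volume within a time window. The thresholds and field names are illustrative assumptions, not recommended values.

```python
from collections import defaultdict

def flag_trickle_readers(read_log, window_s=86_400,
                         max_rows_per_request=100,
                         min_requests=500,
                         min_total_rows=20_000):
    """Return users whose many small reads add up to a large volume in the window."""
    per_user = defaultdict(lambda: {"requests": 0, "rows": 0})
    window_start = max(r["ts"] for r in read_log) - window_s
    for r in read_log:
        if r["ts"] >= window_start and r["rows"] <= max_rows_per_request:
            per_user[r["user"]]["requests"] += 1
            per_user[r["user"]]["rows"] += r["rows"]
    return [
        user for user, stats in per_user.items()
        if stats["requests"] >= min_requests and stats["rows"] >= min_total_rows
    ]

# Synthetic example: 600 reads of 50 rows each over one day trips the heuristic.
log = [{"user": "svc_report", "ts": i * 120, "rows": 50} for i in range(600)]
print(flag_trickle_readers(log))
```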
Data Point No. 5: Abnormal service behavior detection
Theft and abuse of credentials meant for applications such as ETL jobs are often hard to detect, because data endpoints have no way to distinguish a regular application from a rogue one. Being able to detect changes in behavior, such as when a presumed ETL job starts writing to a data endpoint from which it normally only reads, allows DevOps and security teams to identify whether their data endpoints’ credentials have been compromised.
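A minimal sketch of this kind of behavioral check appears below: it builds a baseline of the operations each credential normally performs, then flags any operation outside that baseline, such as a write from a credential that has only ever read. Field names are illustrative assumptions.

```python
from collections import defaultdict

def build_baseline(history):
    """Map each credential to the set of operations seen during the baseline window."""
    baseline = defaultdict(set)
    for event in history:
        baseline[event["credential"]].add(event["operation"])
    return baseline

def detect_anomalies(baseline, live_events):
    """Flag live events whose operation was never seen for that credential."""
    return [
        e for e in live_events
        if e["operation"] not in baseline.get(e["credential"], set())
    ]

# Synthetic example: a credential that only ever ran SELECTs suddenly inserts.
history = [{"credential": "etl_reader", "operation": "select"} for _ in range(1000)]
live = [{"credential": "etl_reader", "operation": "insert", "table": "customers"}]
print(detect_anomalies(build_baseline(history), live))
```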
If you have a suggestion for an eWEEK Data Points article, email cpreimesberger@eweek.com.