DirectQuery Modeling Gotchas in Power BI
By ClearEdge Intelligence
DirectQuery in Power BI seems straightforward: instead of importing data, you query the source directly. Real-time data, no refresh scheduling, smaller file sizes.
In practice, it's full of sharp edges. Here are the gotchas that have bitten us (and our clients) over the years.
Gotcha #1: The Aggregation Problem
In Import mode, Power BI pre-aggregates data during refresh. A table with 10 million rows becomes efficient lookup tables.
In DirectQuery, every aggregation happens at query time against the full dataset. That "simple" card visual showing total sales? It's running a full table scan every time the report loads.
The fix: Create aggregation tables in your source database. Use composite models to import aggregated data while keeping detail data in DirectQuery.
Gotcha #2: The Relationship Penalty
Every relationship in your model generates JOIN operations in DirectQuery. A model with five related tables might generate five-way JOINs for basic visuals.
We've seen reports go from 2-second load times to 45+ seconds just by adding a few lookup tables.
The fix: Flatten your model where possible. Denormalize dimension attributes into your fact tables. Fewer relationships = faster queries.
Gotcha #3: Calculated Columns Don't Work Like You Think
In Import mode, calculated columns compute once during refresh and store the results.
In DirectQuery, calculated columns must be computed for every query. Complex DAX in a calculated column can devastate performance.
The fix: Push calculated columns to the source. Create computed columns in your database view or use database-level calculations instead of DAX.
Gotcha #4: The Time Intelligence Trap
DAX time intelligence functions (SAMEPERIODLASTYEAR, DATEADD, etc.) work differently in DirectQuery. They often can't be folded into efficient SQL, resulting in row-by-row processing.
A year-over-year comparison that takes 1 second in Import mode might take 30 seconds in DirectQuery.
The fix: Pre-calculate time comparisons in your source. Add columns for "same_period_last_year_value" directly in your data warehouse.
Gotcha #5: Row-Level Security Overhead
RLS in DirectQuery adds WHERE clauses to every query. If your RLS logic is complex (multiple conditions, lookups to other tables), query performance suffers significantly.
The fix: Simplify RLS rules. Use single-column filters where possible. Consider pre-filtering data in the source based on user context.
Gotcha #6: The "It Works in Development" Problem
Development typically happens with a subset of data or against a less-loaded server. DirectQuery performance issues often only appear with:
- Production data volumes
- Concurrent users
- Peak database load times
The fix: Test with production-representative data and load. Run performance tests with multiple concurrent users before deploying.
Gotcha #7: Measure Dependencies
Complex measures that reference other measures create nested queries. Each level of nesting can multiply query execution time.
[Gross Margin %] = DIVIDE([Gross Margin], [Revenue])
[Gross Margin] = [Revenue] - [COGS]
This innocent-looking pattern generates multiple sub-queries in DirectQuery.
The fix: Flatten measure logic where possible. Accept some code duplication in exchange for better query plans.
Gotcha #8: Connection Limits
DirectQuery opens connections to your source database. Each user session, each visual, each interaction potentially opens new connections.
20 users with 10 visuals each could mean 200 concurrent connections. Many databases have connection limits that you'll hit surprisingly quickly.
The fix: Monitor connection usage. Consider connection pooling at the database level. Use aggregations to reduce query volume.
When DirectQuery Actually Makes Sense
Despite these gotchas, DirectQuery is the right choice when:
- Data freshness is critical - You need data updated within minutes, not hours
- Data volumes are huge - Terabytes that can't practically be imported
- Source is already optimized - You have a well-tuned data warehouse designed for analytical queries
- Security requires it - Data must not leave the source system
The Composite Model Middle Ground
Power BI's composite models let you mix Import and DirectQuery in the same model:
- Import dimension tables and aggregations
- DirectQuery for detail data
- Best of both worlds for many scenarios
This is often the right answer when pure Import feels too stale and pure DirectQuery feels too slow.
Performance Testing Checklist
Before deploying a DirectQuery model to production:
- [ ] Test with production data volumes
- [ ] Simulate concurrent user load
- [ ] Check database query plans for full table scans
- [ ] Monitor database connection usage
- [ ] Test during peak database load periods
- [ ] Verify RLS doesn't kill performance
- [ ] Check all time intelligence calculations
DirectQuery can work beautifully when done right. But "done right" requires understanding these gotchas and designing around them from the start.
Related Articles
From Spreadsheet Chaos to Governed Metrics
Every growing company hits the point where spreadsheet reporting breaks down. Here's how to recognize when it's time to evolve, and how to make the transition without losing your mind.
How to Pick the Right KPIs for Your Supply Chain Dashboard
Not all metrics deserve dashboard real estate. Here's how to identify the KPIs that actually drive decisions and avoid the vanity metrics that just create noise.
Building a Property Scoring Model: Data Sources That Matter
A good property scoring model is only as good as its data inputs. Here's the data hierarchy for real estate investment analysis, from essential to nice-to-have.
Stay Updated
Get insights on AI and automation delivered to your inbox.