When I run a query with AWS Athena, I get the error message ‘query exhausted resources on this scale factor’. Why?
AWS Athena is well documented in having performance issues, both in terms of unpredictability and speed. Many users have pointed out that even relatively lightweight queries on Athena will fail. One part of the issue may be due to how many columns the user has in the Group By clause – even a small amount of columns (like less than 5 columns) will run into this issue of not having enough resources to complete. Other times it may be due to how much data is being parsed, and again even small amounts of data (like less than 200MB) will run into this issue of not having enough resources to complete.
Presto stores Group By columns in memory while it works to match rows with the same group by key. The more columns that are in the Group By clause, the fewer number of rows that will get collapsed with the aggregation. To address this problem, users will have to reduce the number of columns in the Group By clause and retry the query.
And still at other times, the issue may not be how long the query takes but if the query runs at all. Users that experience “internal errors” on queries one hour will re-run the same queries that triggered those errors and they will succeed.
Ultimately, AWS Athena is not predictable when it comes to query performance. That’s where Ahana Cloud, a managed service for Presto, can help. Ahana’s managed service for PrestoDB can help with some of the trade offs associated with a serverless service.
Some of the reasons you might want to try a managed service if you’re running into performance issues with AWS Athena:
- You get full control of your deployment, including the number PrestoDB nodes in your deployment and the node instance-types for optimum price/performance.
- Consistent performance because you have full control of the deployment
- Ahana is cloud-native and runs on Amazon Elastic Kubernetes (EKS), helping you to reduce operational costs with its automated cluster management, increased resilience, speed, and ease of use.
- No limits on queries
Plus you can use your existing metastore, so you don’t need to modify your existing architecture.
Check out the case study from ad tech company Carbon on why they moved from AWS Athena to Ahana Cloud for better query performance and more control over their deployment.