cost-analysis-export: Support multiple CSV schemas #5579
base: master
Conversation
Add dynamic query builder to handle different column names across:
- Legacy EA/PAYG (InstanceId, UsageDateTime, UsageQuantity, PreTaxCost)
- Modern EA 2024 (ResourceId, Date, Quantity, CostInBillingCurrency)
- MCA/MCA Partner/CSP (camelCase: resourceId, date, quantity, costInBillingCurrency)
- FOCUS (ResourceId, ChargePeriodStart, ConsumedQuantity, BilledCost)

Key changes:
- Detect ResourceID and Date columns dynamically for JOIN
- Multiply all cost/quantity columns by split fraction
- Preserve original column names in output
- Add test cases for Modern EA, MCA, and FOCUS schemas
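The detection step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `detectColumn` helper and the variant lists are hypothetical names built from the schemas listed in the description.

```go
package main

import (
	"fmt"
	"strings"
)

// Known header variants across the Azure cost export schemas named in the PR
// (illustrative lists, not exhaustive).
var resourceIDVariants = []string{"InstanceId", "ResourceId", "resourceId"}
var dateVariants = []string{"UsageDateTime", "Date", "date", "ChargePeriodStart"}

// detectColumn returns the first header matching a known variant
// (case-insensitive), preserving the original spelling, or "" if none match.
func detectColumn(headers, variants []string) string {
	for _, v := range variants {
		for _, h := range headers {
			if strings.EqualFold(h, v) {
				return h
			}
		}
	}
	return ""
}

func main() {
	// A FOCUS-style header row.
	headers := []string{"ResourceId", "ChargePeriodStart", "ConsumedQuantity", "BilledCost"}
	fmt.Println(detectColumn(headers, resourceIDVariants)) // ResourceId
	fmt.Println(detectColumn(headers, dateVariants))       // ChargePeriodStart
}
```

Returning the matched header verbatim (rather than a canonical name) is what lets the query builder preserve original column names in the output.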
Automatically append trailing slash to AZURE_STORAGE_AKS_DATA_PREFIX and AZURE_STORAGE_COST_EXPORT_PREFIX if missing. Also add logging when no AKS export files are found to aid debugging.
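The trailing-slash fix amounts to a small normalization on each prefix before it is used for path joining. A minimal sketch, with a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// ensureTrailingSlash appends "/" to a non-empty prefix that lacks one,
// so prefix+filename always joins cleanly. Empty prefixes are left alone.
func ensureTrailingSlash(prefix string) string {
	if prefix != "" && !strings.HasSuffix(prefix, "/") {
		return prefix + "/"
	}
	return prefix
}

func main() {
	fmt.Println(ensureTrailingSlash("exports/aks"))  // exports/aks/
	fmt.Println(ensureTrailingSlash("exports/aks/")) // exports/aks/ (unchanged)
}
```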
Pull request overview
This PR enhances the cost analysis export tool to support multiple Azure cost export CSV schemas, making it compatible with Legacy EA/PAYG, MCA (Microsoft Customer Agreement), and FOCUS (FinOps Open Cost and Usage Specification) formats. The implementation dynamically detects the schema by inspecting column names rather than requiring explicit configuration.
Changes:
- Added dynamic schema detection that identifies resource ID and date columns from a list of known variants across different Azure billing schemas
- Fixed a bug where missing trailing slashes in storage prefix paths caused file discovery failures
- Added logging to aid debugging when no AKS export files are found
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| examples/cost-analysis-export/main.go | Implements dynamic column detection, adds prefix path normalization in config validation, adds file processing logging, and builds SQL queries dynamically based on detected schema |
| examples/cost-analysis-export/main_test.go | Updates test assertions to check for column presence rather than exact order/content, adds comprehensive test cases for Modern EA, MCA, and FOCUS schemas |
```go
	return fmt.Errorf("building join query: %w", err)
}

slog.Info("executing join query", "query", query)
```
Copilot AI · Jan 29, 2026
Logging the entire SQL query could expose sensitive information or create very large log entries. Consider logging only a summary (e.g., the detected column names) or using debug-level logging instead of info-level for the full query.
Suggested change:
```diff
- slog.Info("executing join query", "query", query)
+ slog.Debug("executing join query")
```
MCA cost exports use MM/DD/YYYY date format while AKS exports use YYYY-MM-DD, causing the INNER JOIN to return zero rows. Added date normalization during CSV import to convert all dates to YYYY-MM-DD format.
Helps debug issues when join returns empty results.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
```go
	if c.Timeout <= 0 {
		return errors.New("EXPORT_TIMEOUT must be positive")
	}
	// Ensure prefix paths end with a slash for consistent path joining
	if c.AzureStorageAKSDataPrefix != "" && !strings.HasSuffix(c.AzureStorageAKSDataPrefix, "/") {
		c.AzureStorageAKSDataPrefix += "/"
	}
	if c.AzureStorageCostExportPrefix != "" && !strings.HasSuffix(c.AzureStorageCostExportPrefix, "/") {
		c.AzureStorageCostExportPrefix += "/"
	}
	return nil
}
```
Copilot AI · Feb 4, 2026
The Config.Validate method modifies the configuration by appending trailing slashes to prefix paths. This is a side effect that violates the principle of least surprise for a "Validate" method, which is typically expected to only check validity without modifying state. Consider renaming this method to "ValidateAndNormalize" or moving the normalization logic to a separate "Normalize" method to make the behavior more explicit.
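The separation the reviewer asks for could look like the following. This is a sketch of the suggested refactor, not the PR's code; the trimmed-down `Config` type mirrors only the fields visible in the snippet above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Config is a trimmed-down sketch holding only the fields relevant here.
type Config struct {
	Timeout                      int
	AzureStorageAKSDataPrefix    string
	AzureStorageCostExportPrefix string
}

// Validate only checks values and never mutates the receiver.
func (c *Config) Validate() error {
	if c.Timeout <= 0 {
		return errors.New("EXPORT_TIMEOUT must be positive")
	}
	return nil
}

// Normalize applies the trailing-slash fix as an explicit, separate step.
func (c *Config) Normalize() {
	if c.AzureStorageAKSDataPrefix != "" && !strings.HasSuffix(c.AzureStorageAKSDataPrefix, "/") {
		c.AzureStorageAKSDataPrefix += "/"
	}
	if c.AzureStorageCostExportPrefix != "" && !strings.HasSuffix(c.AzureStorageCostExportPrefix, "/") {
		c.AzureStorageCostExportPrefix += "/"
	}
}

func main() {
	c := &Config{Timeout: 30, AzureStorageAKSDataPrefix: "aks-data"}
	if err := c.Validate(); err != nil {
		panic(err)
	}
	c.Normalize()
	fmt.Println(c.AzureStorageAKSDataPrefix) // aks-data/
}
```

Callers then invoke `Validate` followed by `Normalize`, making the mutation visible at the call site.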
```go
if filesProcessed == 0 {
	slog.Error("no AKS export files found", "prefix", a.Config.AzureStorageAKSDataPrefix, "expected_pattern", a.Config.AzureStorageAKSDataPrefix+"export-*.csv")
} else {
	slog.Info("processed AKS export files", "count", filesProcessed)
}

return nil
```
Copilot AI · Feb 4, 2026
The error log at line 357 uses slog.Error, but the function returns nil rather than returning an error. This makes the "no AKS export files found" condition a silent failure that may not be immediately obvious to operators. Consider returning an error when no files are found, or at minimum use slog.Warn instead of slog.Error to indicate this is not a hard failure.
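The "return an error" alternative the reviewer describes could be sketched like this. Function and variable names are illustrative; only the log fields mirror the snippet above.

```go
package main

import (
	"fmt"
	"log/slog"
)

// finishProcessing treats "no files found" as a returned error rather than
// an error-level log followed by a nil return, so callers cannot miss it.
func finishProcessing(filesProcessed int, prefix string) error {
	if filesProcessed == 0 {
		return fmt.Errorf("no AKS export files found under prefix %q (expected %sexport-*.csv)", prefix, prefix)
	}
	slog.Info("processed AKS export files", "count", filesProcessed)
	return nil
}

func main() {
	if err := finishProcessing(0, "aks-data/"); err != nil {
		fmt.Println(err)
	}
}
```

If an empty result is sometimes legitimate, the softer variant is to keep `return nil` but downgrade the log to `slog.Warn`, as the comment also suggests.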
```go
	return fmt.Errorf("building join query: %w", err)
}

slog.Info("executing join query", "query", query)
```
Copilot AI · Feb 4, 2026
The entire SQL query is being logged at line 818. For production environments with large schemas, this could result in very large log entries that may impact performance or log storage. Consider logging only a summary of the query (e.g., the detected column names and schema type) rather than the entire query, or making this detailed logging conditional on a debug flag.
Suggested change:
```diff
- slog.Info("executing join query", "query", query)
+ slog.Info("executing join query")
```
```go
// normalizeDate converts a date string to YYYY-MM-DD format.
// Detects format by checking for "/" (MM/DD/YYYY) or "-" (YYYY-MM-DD).
func normalizeDate(value string) string {
	if value == "" {
		return value
	}
	if strings.Contains(value, "/") {
		// Parse as MM/DD/YYYY
		if t, err := time.Parse("01/02/2006", value); err == nil {
			return t.Format("2006-01-02")
		}
	}
	// Already YYYY-MM-DD or unknown format, return as-is
	return value
}
```
Copilot AI · Feb 4, 2026
The normalizeDate function silently returns the original value if parsing fails. This could lead to data corruption if an unrecognized date format is encountered. Consider logging a warning when a date format cannot be parsed, so that operators can detect and address schema changes or unexpected data formats.
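A minimal version of the change this comment asks for: the PR's `normalizeDate` with a `slog.Warn` added on the fall-through path for slash-containing values that fail to parse. The warning placement is a sketch, not the PR's actual fix.

```go
package main

import (
	"fmt"
	"log/slog"
	"strings"
	"time"
)

// normalizeDate converts MM/DD/YYYY to YYYY-MM-DD, warning when a
// slash-containing value fails to parse so operators can spot schema drift.
func normalizeDate(value string) string {
	if value == "" {
		return value
	}
	if strings.Contains(value, "/") {
		if t, err := time.Parse("01/02/2006", value); err == nil {
			return t.Format("2006-01-02")
		}
		// Surface the unexpected format instead of failing silently.
		slog.Warn("unrecognized date format, passing through unchanged", "value", value)
	}
	// Already YYYY-MM-DD or unknown format, return as-is.
	return value
}

func main() {
	fmt.Println(normalizeDate("06/18/2025")) // 2025-06-18
	fmt.Println(normalizeDate("13/45/2025")) // passed through with a warning
}
```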
Copilot AI · Feb 4, 2026
The normalizeDate function uses a simple heuristic to detect date formats based on the presence of "/" or "-". This approach may fail for edge cases such as dates with both slashes and dashes (e.g., "2025/06-18"), timestamps with time components (e.g., "06/18/2025 14:30:00"), or other variations. Consider adding more robust date format detection or supporting additional common date formats explicitly.
Suggested change:
```diff
-// Detects format by checking for "/" (MM/DD/YYYY) or "-" (YYYY-MM-DD).
-func normalizeDate(value string) string {
-	if value == "" {
-		return value
-	}
-	if strings.Contains(value, "/") {
-		// Parse as MM/DD/YYYY
-		if t, err := time.Parse("01/02/2006", value); err == nil {
-			return t.Format("2006-01-02")
-		}
-	}
-	// Already YYYY-MM-DD or unknown format, return as-is
+// Tries common date and datetime formats and returns the original value if parsing fails.
+func normalizeDate(value string) string {
+	trimmed := strings.TrimSpace(value)
+	if trimmed == "" {
+		return value
+	}
+	// Build candidate strings to parse: date-only part and full value.
+	candidates := []string{trimmed}
+	if idx := strings.IndexAny(trimmed, " T"); idx > 0 {
+		datePart := strings.TrimSpace(trimmed[:idx])
+		if datePart != "" && datePart != trimmed {
+			candidates = append([]string{datePart}, candidates...)
+		}
+	}
+	// Common layouts to attempt, from more specific to more general.
+	layouts := []string{
+		time.RFC3339, // e.g., 2025-06-18T14:30:00Z
+		"2006-01-02", // e.g., 2025-06-18
+		"01/02/2006", // e.g., 06/18/2025
+		"2006/01/02", // e.g., 2025/06/18
+	}
+	for _, candidate := range candidates {
+		for _, layout := range layouts {
+			if t, err := time.Parse(layout, candidate); err == nil {
+				return t.Format("2006-01-02")
+			}
+		}
+	}
+	// Unknown format, return as-is.
```
Summary