12 April 2024

Time Series Aggregations

There are several ways to combine multiple time series. Here are some common approaches in TypeScript:

Direct Combination: The simplest approach merges the series into a single data set by concatenation. Given two time series arrays, seriesA and seriesB, you can join them with concat. The result holds all of seriesA followed by all of seriesB, so it is in time order only if seriesB starts after seriesA ends.

let seriesA = [...] // Time series data A
let seriesB = [...] // Time series data B

let combinedSeries = seriesA.concat(seriesB);

Use of zip function in lodash: If every series is sampled at the same times, in the same order, and you want the values for each time point side by side, you can use the _.zip function from lodash. It pairs elements by array index, producing a new array in which each entry holds the values from every series at that position.

import _ from 'lodash';

let seriesA = [...] // Time series data A
let seriesB = [...] // Time series data B

let zippedSeries = _.zip(seriesA, seriesB);
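
To make the shape of the result concrete, here is a minimal dependency-free sketch of the same index-wise pairing. The helper name zipByIndex and the sample values are made up for illustration, and it assumes both arrays are already aligned so that position i refers to the same time in each.

```typescript
// Hypothetical helper: pairs values by array position, like _.zip for two arrays.
function zipByIndex<A, B>(a: A[], b: B[]): Array<[A | undefined, B | undefined]> {
  const length = Math.max(a.length, b.length);
  const result: Array<[A | undefined, B | undefined]> = [];
  for (let i = 0; i < length; i++) {
    // Positions past the end of the shorter array come back as undefined
    result.push([a[i], b[i]]);
  }
  return result;
}

// Illustrative values only; index i is assumed to be the same time point in both
const valuesA = [1, 2, 3];
const valuesB = [10, 20];

const zipped = zipByIndex(valuesA, valuesB);
console.log(zipped); // [[1, 10], [2, 20], [3, undefined]]
```

The undefined in the last pair shows why this method suits only series of equal length sampled at the same times.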

Merging based on Timestamp:

If each of your time series is an array of objects with a 'timestamp' and a 'value' key, then you can merge the series on the timestamp.

let seriesA = [...]; // [{timestamp: ..., value: ...}, ...]
let seriesB = [...]; // [{timestamp: ..., value: ...}, ...]

let mergedSeries = [...seriesA, ...seriesB];

mergedSeries.sort((a, b) => {
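  // Compare numeric epoch milliseconds; TypeScript rejects subtracting Date objects directly
  return new Date(a.timestamp).getTime() - new Date(b.timestamp).getTime();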
});
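
If the goal is one row per timestamp with a column per series, rather than one interleaved list, a join keyed on the timestamp is one option. The sketch below is illustrative only: the Point and JoinedRow types, the joinOnTimestamp name, and the sample data are assumptions, and timestamps are assumed to be ISO 8601 UTC strings in one consistent format so that they can serve as map keys.

```typescript
type Point = { timestamp: string; value: number };
type JoinedRow = { timestamp: string; a: number | null; b: number | null };

// Hypothetical helper: full outer join of two series on their timestamp strings.
function joinOnTimestamp(seriesA: Point[], seriesB: Point[]): JoinedRow[] {
  const rows = new Map<string, JoinedRow>();
  // Return the row for a timestamp, creating an empty one on first sight
  const rowFor = (timestamp: string): JoinedRow => {
    let row = rows.get(timestamp);
    if (!row) {
      row = { timestamp, a: null, b: null };
      rows.set(timestamp, row);
    }
    return row;
  };
  for (const p of seriesA) rowFor(p.timestamp).a = p.value;
  for (const p of seriesB) rowFor(p.timestamp).b = p.value;
  // Order rows chronologically by the instant each timestamp represents
  return Array.from(rows.values()).sort(
    (x, y) => new Date(x.timestamp).getTime() - new Date(y.timestamp).getTime()
  );
}

const joined = joinOnTimestamp(
  [
    { timestamp: '2023-01-01T10:00:00.000Z', value: 1 },
    { timestamp: '2023-01-01T10:01:00.000Z', value: 2 },
  ],
  [
    { timestamp: '2023-01-01T10:01:00.000Z', value: 3 },
    { timestamp: '2023-01-01T10:02:00.000Z', value: 4 },
  ]
);
console.log(joined);
```

Rows where a or b is null mark timestamps that only one of the series observed.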

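Concatenating and sorting only interleaves the points from each series; values from different series line up only where their timestamps coincide. If the timestamps differ between series, you will need to interpolate the data, which can be considerably more complex.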

One way you can do this is by creating a uniform timestamp series that encompasses all the timestamps across series. Then, for each series, fill in its values at the new timestamps using an appropriate interpolation method. Here's a very simplified conceptual example to illustrate how this may work:

import _ from 'lodash';
import { formatISO, eachMinuteOfInterval, parseISO } from 'date-fns';

// Let's assume you have two time series data as follows: seriesA and seriesB
let seriesA = [{timestamp: '2023-01-01T10:00:00.000Z', value: 1}, {timestamp: '2023-01-01T10:02:00.000Z', value: 2}];
let seriesB = [{timestamp: '2023-01-01T10:01:00.000Z', value: 3}, {timestamp: '2023-01-01T10:03:00.000Z', value: 4}];
 
// Combine both series
let combined = [...seriesA, ...seriesB];

// Sort combined series by timestamp
combined.sort((a, b) => new Date(a.timestamp).getTime() - new Date(b.timestamp).getTime());

// Create a uniform timestamp series that encompasses all the timestamps
let firstTimestamp = combined[0].timestamp;
let lastTimestamp = combined[combined.length-1].timestamp;
let timeRange = {start: new Date(firstTimestamp), end: new Date(lastTimestamp)};
let uniformTimestamps = eachMinuteOfInterval(timeRange).map(t => formatISO(t)); // Generally, use the same granularity as your original timestamps

// Placeholder for the final interpolated series
let interpolatedSeries = [];

// Interpolate each series
[seriesA, seriesB].forEach(series => {
  let interpolated = uniformTimestamps.map(timestamp => {
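    // Compare instants rather than strings: formatISO output can differ in format
    // (local offset, no milliseconds) from the timestamp strings in the input data
    let t = parseISO(timestamp).getTime();
    // Find the nearest data points at or before and at or after t (series must be sorted by time)
    let nearestBefore = _.findLast(series, data => parseISO(data.timestamp).getTime() <= t);
    let nearestAfter = _.find(series, data => parseISO(data.timestamp).getTime() >= t);
    
    // You may want to choose an appropriate interpolation method here
    // This example uses a simple linear interpolation
    if (nearestBefore && nearestAfter) {
      let t0 = parseISO(nearestBefore.timestamp).getTime();
      let t1 = parseISO(nearestAfter.timestamp).getTime();
      // On an exact match both neighbours are the same point, so avoid dividing by zero
      let progress = t1 === t0 ? 0 : (t - t0) / (t1 - t0);
      let interpolatedValue = nearestBefore.value + progress * (nearestAfter.value - nearestBefore.value);
      return {timestamp, value: interpolatedValue};
    } else {
      // Outside the range of this series: hold the nearest endpoint value (series assumed non-empty)
      return {timestamp, value: nearestBefore ? nearestBefore.value : nearestAfter!.value};
    }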
  });
  
  interpolatedSeries.push(interpolated);
});

This example uses the date-fns library for date handling and lodash for its utility functions. Linear interpolation might not be appropriate for all data, so you may want to choose a different method depending on your needs and the nature of your data. Note that interpolatedSeries holds one array per input series, all sharing the same uniform timestamps, so every timestamp appears once per series.

To reduce these to a single series, add a step after interpolation that consolidates the data points sharing a timestamp. You can compute the average, sum, median, or any other appropriate summary statistic of the values for each unique timestamp.

Here's how you can add a step to compute the average of values with the same timestamp:

import { groupBy, map, meanBy } from 'lodash';

// Group the interpolated data by timestamp
let groupedByTimestamp = groupBy(interpolatedSeries.flatMap(series => series), 'timestamp');

// For each group, compute the average value
let averageSeries = map(groupedByTimestamp, (group, timestamp) => ({
  timestamp: timestamp,
  value: meanBy(group, 'value'),
}));

// Sort by timestamp for final output
averageSeries.sort((a, b) => new Date(a.timestamp).getTime() - new Date(b.timestamp).getTime());
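
For comparison, the same interpolate-then-average idea can be sketched without any libraries by working in epoch milliseconds. This is an illustration under stated assumptions, not a drop-in replacement: the names Sample, interpolateAt and averageOnGrid are hypothetical, each series is assumed to be non-empty and sorted by strictly increasing time, and values outside a series' range are held at the nearest endpoint.

```typescript
type Sample = { time: number; value: number }; // time in epoch milliseconds

// Hypothetical helper: linearly interpolate a sorted, non-empty series at time t.
function interpolateAt(series: Sample[], t: number): number {
  // Outside the observed range, hold the nearest endpoint value
  if (t <= series[0].time) return series[0].value;
  const last = series[series.length - 1];
  if (t >= last.time) return last.value;
  // First sample at or after t; its predecessor lies strictly before t
  const hi = series.findIndex(s => s.time >= t);
  const before = series[hi - 1];
  const after = series[hi];
  const progress = (t - before.time) / (after.time - before.time);
  return before.value + progress * (after.value - before.value);
}

// Hypothetical helper: average several series on a shared grid of timestamps.
function averageOnGrid(seriesList: Sample[][], grid: number[]): Sample[] {
  return grid.map(time => {
    const sum = seriesList.reduce((acc, s) => acc + interpolateAt(s, time), 0);
    return { time, value: sum / seriesList.length };
  });
}

// Same sample values as the example above, one point per minute
const minute = 60000;
const start = Date.UTC(2023, 0, 1, 10, 0, 0);
const a: Sample[] = [{ time: start, value: 1 }, { time: start + 2 * minute, value: 2 }];
const b: Sample[] = [{ time: start + minute, value: 3 }, { time: start + 3 * minute, value: 4 }];
const grid = [0, 1, 2, 3].map(i => start + i * minute);

const averaged = averageOnGrid([a, b], grid);
console.log(averaged.map(p => p.value)); // [2, 2.25, 2.75, 3]
```

Holding the endpoint value outside a series' range is a design choice; depending on the data, leaving those points empty may be more honest than extrapolating.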

For financial time series data, you might want to forward-fill missing values, which means carrying the last observed value forward until a new value is encountered. This method is often used when you don't want to make assumptions about the trend in the data by interpolating.

Here is an example of how you might accomplish this:

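// Each entry holds a numeric timestamp plus one or more numeric value fields
type TimeSeries = Array<Record<string, number>>

function mergeTimeSeries (seriesArray: TimeSeries[], keys: string[], timestampKey: string) {
  // Collect the union of all timestamps and index each series by timestamp
  const allTimestamps = new Set<number>()
  const seriesMaps = seriesArray.map(series => {
    const map = new Map<number, Record<string, number>>()
    series.forEach(entry => {
      const timestamp = entry[timestampKey]
      allTimestamps.add(timestamp)
      map.set(timestamp, entry)
    })
    return map
  })
  const sortedTimestamps = Array.from(allTimestamps).sort((a, b) => a - b)
  // Last observed values per series and key; each stays 0 until the first observation
  const lastValues = seriesArray.map(() => new Float64Array(keys.length))
  return sortedTimestamps.map(t => {
    // Update the carried-forward values for every series that has an entry at t
    seriesMaps.forEach((mapObj, i) => {
      const entry = mapObj.get(t)
      if (entry) {
        keys.forEach((key, j) => {
          lastValues[i][j] = entry[key] || 0
        })
      }
    })
    // Sum the carried-forward values across all series for each key
    const mergedDataPoint: Record<string, number> = { [timestampKey]: t }
    keys.forEach((key, i) => {
      mergedDataPoint[key] = lastValues.reduce((sum, valueArr) => sum + valueArr[i], 0)
    })
    return mergedDataPoint
  })
}

let seriesA: TimeSeries = [{t: 1, value1: 1, value2: 2}, {t: 2, value1: 1, value2: 3}, {t: 3, value1: 2, value2: 4}, {t: 4, value1: 3, value2: 5}];
let seriesB: TimeSeries = [{t: 1, value1: 1, value2: 2}, {t: 2, value1: 10, value2: 20}, {t: 5, value1: 3, value2: 6}];
let keys = ['value1', 'value2'];
// The third argument names the timestamp field in each entry
console.log(mergeTimeSeries([seriesA, seriesB], keys, 't'));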

In this code, mergeTimeSeries combines any number of time series by forward-filling missing values. It builds the union of all timestamps across the series and traverses it in chronological order. For each timestamp, it checks whether each series has an entry and, if so, updates that series' last observed values (lastValues). It then sums the last observed values across the series to produce the combined value for each key at that timestamp. A series contributes 0 until its first observation. The function returns an array of combined entries, each with a timestamp and, for every key, the sum of the last observed values at that timestamp.
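
When the series should stay separate instead of being summed, the same forward-fill idea can produce one column per series. The following is a minimal sketch, not part of the function above: the names Tick and forwardFillColumns are hypothetical, the sample numbers are invented, and null is used (as an assumption) for timestamps before a series' first observation so that "no data yet" is not confused with a real zero.

```typescript
type Tick = { t: number; value: number };
type FilledRow = { t: number; values: Array<number | null> };

// Hypothetical helper: align series on the union of their timestamps,
// carrying each series' last observed value forward.
function forwardFillColumns(seriesList: Tick[][]): FilledRow[] {
  const timestampSet = new Set<number>();
  const lookups = seriesList.map(series => {
    const lookup = new Map<number, number>();
    for (const tick of series) {
      timestampSet.add(tick.t);
      lookup.set(tick.t, tick.value);
    }
    return lookup;
  });
  const timestamps = Array.from(timestampSet).sort((x, y) => x - y);
  // null means the series has not been observed yet
  const last: Array<number | null> = seriesList.map(() => null);

  return timestamps.map(t => {
    lookups.forEach((lookup, i) => {
      const observed = lookup.get(t);
      if (observed !== undefined) last[i] = observed;
    });
    // Copy so later updates do not change rows already emitted
    return { t, values: last.slice() };
  });
}

// Invented sample data: two series observed at different times
const prices: Tick[] = [{ t: 1, value: 100 }, { t: 3, value: 102 }];
const volumes: Tick[] = [{ t: 2, value: 5 }, { t: 4, value: 7 }];

const filled = forwardFillColumns([prices, volumes]);
console.log(filled);
// [{t: 1, values: [100, null]}, {t: 2, values: [100, 5]},
//  {t: 3, values: [102, 5]}, {t: 4, values: [102, 7]}]
```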