Efficient and smart storage of time series

2 replies
edmondo1984
Joined: 2011-09-14,
Dear all,
I have the following use case, and I would like to hear your suggestions.

I have to store data as (t, y) pairs, where t is a time instant and y is the value y = f(t).

In a simple case, since my t values were equidistant in time, I could store them efficiently in an array.

class Data(values: Array[Double], pointsFrequency: Int) {

  // pointsFrequency is the number of months between two consecutive stored points
  final def apply(month: Int): Double = values(month / pointsFrequency)

}

Imagine now I have the following case: for low t I want to store very frequent data, for higher t I want to store less frequent data.

I end up with a ComplexData:

class ComplexData(subdata: IndexedSeq[Data]) {

  // how to pick the right sub-series for a given month is the open question
  final def apply(month: Int): Double = ???

}
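
To make the question concrete, a naive version of such a lookup might look like the sketch below. The names Segment, startMonth, step, endMonth and PiecewiseData are hypothetical and only for illustration: each segment is an equally spaced run of samples, and the right segment is found with a binary search over the segment start months. It assumes the segments are sorted by start month and do not overlap.

```scala
// Hypothetical names throughout; a sketch, not a tuned implementation.
// One equally spaced run of samples: values(i) is used for every month in
// [startMonth + i * step, startMonth + (i + 1) * step).
final case class Segment(startMonth: Int, step: Int, values: Array[Double]) {
  require(step > 0, "step must be positive")

  def apply(month: Int): Double = values((month - startMonth) / step)

  // First month that is no longer covered by this segment.
  def endMonth: Int = startMonth + values.length * step
}

// Assumption: segments are sorted by startMonth and do not overlap.
final class PiecewiseData(segments: IndexedSeq[Segment]) {
  private val starts: Array[Int] = segments.map(_.startMonth).toArray

  def apply(month: Int): Double = {
    // binarySearch returns the index on an exact hit,
    // otherwise -(insertionPoint) - 1
    val pos = java.util.Arrays.binarySearch(starts, month)
    val idx = if (pos >= 0) pos else -pos - 2
    require(idx >= 0 && month < segments(idx).endMonth,
      s"month $month is not covered by any segment")
    segments(idx)(month)
  }
}
```

With a monthly segment Segment(0, 1, ...) holding 12 values followed by a half-yearly Segment(12, 6, Array(100.0, 200.0)), a lookup for month 17 lands in the second segment and returns its first value. The lookup costs O(log n) in the number of segments, which is usually small.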

What is the best implementation you can imagine? :)

Best Regards

Tim P
Joined: 2011-07-28,
Re: Efficient and smart storage of time series

Hi Edmondo
Some important questions that would help us understand what you want:
a) How much data are we talking about?
b) How do you process it (sequentially, random search by time interval, ...)?
c) How space-efficient or fast does it really need to be?
d) Are you accessing all the values or just sampling?
e) What exactly do you mean by low t and high t in
> for low t I want to store very
> frequent data, for higher t I want to store less frequent data.

On 10 January 2012 10:21, Edmondo Porcu wrote:
> [...]

Sciss
Joined: 2008-12-17,
Re: Efficient and smart storage of time series

if you are willing to read through CS papers, maybe start from something like this

to gather ideas.

it all depends on your population size and the performance requirements.

obviously you need some sort of subsampling. if your distribution is approximately logarithmic that's probably the easiest to codify.
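
as a rough sketch of the logarithmic case (my own assumption about the layout, with hypothetical names LogSampledData and indexOf): keep one stored sample per power-of-two bucket, so that values(k) answers every month t with 2^k <= t + 1 < 2^(k+1). the index is then just floor(log2(t + 1)), which can be computed with integer bit operations.

```scala
// Hypothetical sketch: values(k) is the single stored sample for all months t
// with 2^k <= t + 1 < 2^(k+1), so the resolution halves as t grows.
final class LogSampledData(values: Array[Double]) {

  // floor(log2(month + 1)), computed from the position of the highest set bit
  def indexOf(month: Int): Int = {
    require(month >= 0, "month must be non-negative")
    31 - Integer.numberOfLeadingZeros(month + 1)
  }

  // Throws ArrayIndexOutOfBoundsException if month is beyond the stored buckets.
  def apply(month: Int): Double = values(indexOf(month))
}
```

with this layout month 0 gets its own sample, months 1 and 2 share one, months 3 to 6 share the next, and so on, so an array of n values covers 2^n - 1 months with a constant-time lookup.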

best, -sciss-
