PerlMonks  

Simple JSON based data storage - what would you recommend?

by blindluke (Hermit)
on Oct 14, 2014 at 12:39 UTC

blindluke has asked for the wisdom of the Perl Monks concerning the following question:

Enlightened Monks!

I'm working on a dashboard of some sort, showing current data for a number of defined views. Right now, I'm wondering about storing the data in a simple, elegant way. I want to store the data for a view as a JSON document, together with a timestamp. The frontend would ask the backend for the most recent document. The documents would be pushed to the storage periodically with another tool. Both the frontend and the data provider are Perl based.

I would like to ask for your opinion - what approach should I take? What tools should I use? These are my current considerations:

  • CouchDB - because of the HTTP based communication, and its ease of use
  • DBIx::CouchLike - over SQLite
  • something else entirely

Maybe I'm too close to the problem to see the simple solution. I only need two things from the API - to be able to store a JSON string, and to be able to retrieve the most recent JSON string stored.

I seek your wisdom and recommendations. Both will be appreciated.

- Luke

Replies are listed 'Best First'.
Re: Simple JSON based data storage - what would you recommend?
by BrowserUk (Patriarch) on Oct 14, 2014 at 13:30 UTC
    I only need two things from the API - to be able to store a JSON string, and to be able to retrieve the most recent JSON string stored.

    Presumably the views are accessed using some name or number.

    It'd be pretty hard to beat your file system for performance for something as simple as this. Each view is simply an appropriately named file within a directory.

    Have the background process write the json file to a temp directory, and then, when complete, delete the 'live' file and rename the new one into the live directory.

    The foreground process simply slurps the named view from its file in the 'live' directory, and uses a (say) 1/10th second sleep-before-retry, if the file doesn't exist at the moment it tries to slurp it.
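
A minimal sketch of the reader side described above, assuming each view lives in a file named `<view>.json` inside the live directory. The sub name `slurp_view`, the `.json` suffix, and the retry limit are illustrative assumptions, not part of the original advice; only the 1/10th second delay comes from the post.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);    # fractional-second sleep

# Hypothetical helper: read one view's JSON from $live_dir, retrying briefly
# if the file is missing (e.g. the writer is between delete and rename).
sub slurp_view {
    my ($live_dir, $view, %opt) = @_;
    my $retries = $opt{retries} // 10;      # assumed retry limit
    my $delay   = $opt{delay}   // 0.1;     # the 1/10th second from the post
    my $path    = "$live_dir/$view.json";   # assumed naming scheme

    for my $attempt (0 .. $retries) {
        if (open my $fh, '<:raw', $path) {
            local $/;                       # slurp mode: read the whole file
            my $json = <$fh>;
            close $fh;
            return $json;
        }
        # Retry only when the file does not exist; anything else is fatal.
        die "Cannot read $path: $!" unless $!{ENOENT};
        sleep $delay if $attempt < $retries;
    }
    return;    # still missing after all retries
}
```

A caller would use something like `my $json = slurp_view('/path/to/live', 'cpu')` and treat an undefined result as "view not available".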

    Assuming a relatively small (low hundreds or less) number of views; and relatively small files (say a few kb); any half decent file system will do a pretty good job of keeping the 'hot' views in cache and will take care of the 'refresh on write' situation automatically.

    For such a simple use-case, anything else is pretty much overkill.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Thank you. You're right - I was overthinking the whole thing. I'll update the files using File::Temp's tempfile and rename, and I'll store them under the same dir, with file name corresponding to the view name.

      Then, on the frontend, I'll just have a handler associated with the path '/data/viewname' that returns the viewname file contents (JSON).
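
One way such a handler could look, sketched as a plain PSGI application (a code reference, so no framework is needed to define it). The factory name `make_data_app`, the `.json` file suffix, and the allowed character set for view names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical PSGI app factory; $dir is the directory holding the view files.
sub make_data_app {
    my ($dir) = @_;
    my $not_found = [404, ['Content-Type' => 'text/plain'], ["Not found\n"]];

    return sub {
        my ($env) = @_;
        my $path = $env->{PATH_INFO} // '';

        # Accept only /data/<name> with a conservative character set,
        # which also rules out path traversal such as "../".
        my ($view) = $path =~ m{\A/data/([A-Za-z0-9_-]+)\z}
            or return $not_found;

        open my $fh, '<:raw', "$dir/$view.json"    # assumed naming scheme
            or return $not_found;
        my $json = do { local $/; <$fh> };         # slurp the whole file
        close $fh;

        return [200, ['Content-Type' => 'application/json'], [$json]];
    };
}
```

To serve it, the returned code reference would be the last expression of an `app.psgi` file run under a PSGI server such as `plackup` (Plack is a CPAN module, not part of core Perl).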

      There is a blurt_atomic sub in Sysadm::Install that handles the tempfile/rename routine, I'll either roll my own in a similar way, or just use the module as an added dependency.
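
A roll-your-own version of the tempfile/rename routine might look like the sketch below. It is not the `blurt_atomic` implementation from Sysadm::Install; the sub name `store_view` and the `.json` suffix are assumptions, and `$json` is assumed to be an already-encoded byte string.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $json to "$dir/$view.json" via temp file + rename.
sub store_view {
    my ($dir, $view, $json) = @_;
    my $target = "$dir/$view.json";    # assumed naming scheme

    # Create the temp file in the same directory, so that the rename
    # never has to cross a filesystem boundary.
    my ($fh, $tmp) = tempfile("$view.XXXXXX", DIR => $dir, UNLINK => 0);
    binmode $fh, ':raw';
    print {$fh} $json or die "write to $tmp failed: $!";
    close $fh        or die "close of $tmp failed: $!";

    # On POSIX systems, rename replaces an existing target atomically,
    # so readers see either the old document or the new one.
    unless (rename $tmp, $target) {
        my $err = $!;
        unlink $tmp;                   # do not leave the temp file behind
        die "rename $tmp -> $target failed: $err";
    }
    return $target;
}
```

Note that `tempfile` creates files readable only by their owner (mode 0600), so a `chmod` before the rename may be needed if the frontend runs as a different user.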

      Again, thank you, both for the time, and the good advice.

      - Luke

        As long as the back-end is a single machine and not a load-balanced farm *and* the number of saved files is relatively small, this will work fine. A shared filesystem across a farm presents its own set of challenges, as does a large number of files in one directory. For load-balanced farms, I would pin the service to one machine, and if the number of files is expected to be greater than 10K, I would use some type of directory partitioning scheme to keep the number of files in a directory low (for those times when you need to manually read the dir).
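
One possible directory partitioning scheme, sketched under assumptions: view names are plain ASCII, files are named `<view>.json`, and 256 buckets are enough. The sub name `partitioned_path` and the choice of an MD5 prefix as the bucket are illustrative, not something the post prescribes.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Path qw(make_path);

# Hypothetical helper: map a view name to a path in one of 256 subdirectories,
# chosen by the first two hex digits of the name's MD5 digest.
sub partitioned_path {
    my ($root, $view) = @_;
    my $bucket = substr md5_hex($view), 0, 2;
    my $dir    = "$root/$bucket";
    make_path($dir);               # no-op if the directory already exists
    return "$dir/$view.json";      # assumed naming scheme
}
```

The same name always maps to the same bucket, so writer and reader agree on the path without any shared index, and 10K files spread over 256 buckets average about 40 files per directory.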

        -derby
Re: Simple JSON based data storage - what would you recommend?
by erix (Prior) on Oct 14, 2014 at 13:05 UTC

    If running a big RDBMS is feasible: PostgreSQL has pretty good json support, extended in the upcoming v9.4.

    PostgreSQL docs: JSON datatypes (links to 9.4, but 9.3 links are easily reached from there)

    postgres/mongodb JSON comparison (EnterpriseDB is a postgres support company)

      Thanks for the reply. It's definitely feasible, but I'm looking for a simple solution. As I said, there are no relations between the data other than time precedence.

      - Luke

Re: Simple JSON based data storage - what would you recommend?
by thargas (Deacon) on Oct 14, 2014 at 13:52 UTC
    Sounds like a cache might work. Something under CHI would allow you to experiment with various back-ends without having to write them. It would probably end up slower than what BrowserUk suggests, but you'd write less code. Depends on what's important to you.

      This is a good suggestion because of the flexibility. Depending on usage, RAM, and strategy choices, it also seems likely to me to be much faster than file-based stuff. If it’s a few hundred smallish JSON “views,” it could all live in memory and only need IO on cache misses/expirations. If it mostly needs IO, then it would be slower, since it’s just one more layer of housekeeping on top of it all. As you say, the CHI backends make trying it out in different ways quite easy.

Re: Simple JSON based data storage - what would you recommend?
by tinita (Parson) on Oct 15, 2014 at 17:53 UTC
    I have been using git to save JSON data for a project in the past. This had the advantage that I didn't need to implement storing history or getting a diff.
