Welcome to another issue of Clojure Weekly, my small routine blog contribution to the Clojure sphere! These are just a few links, normally 4 or 5 URLs, pointing at articles, documentation, screencasts, podcasts or anything else that attracts my attention. I add a small comment so you can decide if you want to look at the whole thing or not. That’s it, enjoy!
ohpauleez/clojure-leap You know that wherever a Java SDK is provided there is also a Clojure library waiting to be built. If you want to impress your friends look no further: the Leap Motion Controller is a 3D motion detection device that attaches to your USB port: https://www.leapmotion.com. This tiny library is a first attempt at managing the thing from Clojure. The project is a work in progress and a fun way to learn some Clojure.
Developing Patches - Clojure Community - Clojure Development Ever wondered how to contribute to Clojure? It is an instructive way to learn the language and help the community around it. Joining the dev mailing list is certainly the first step, and it takes some time because of the agreement that you need to sign and send to Rich (cool, really: Rich is receiving requests from aspiring Clojure committers worldwide directly at his desk). The next step is to grab any of the bugs and requests in http://dev.clojure.org/jira/browse/CLJ and have a look around to see how bugs are discussed in Jira and sometimes on the mailing list. After that, you’re good to go, and this is a very detailed guide on how to contribute to Clojure.
Manning: Big Data Big Data is the new book that Nathan Marz and James Warren are writing to describe an architectural approach to Big Data called the Lambda Architecture. It consists of three main layers: a batch layer, a serving layer and a speed layer. The basic principle behind the lambda architecture view of data is that information is immutable (hence the lambda name). Similar to Datomic, changes to data are recorded as logs. The serving layer is constantly updated with data coming from the batch layer, and it also contains a first aggregation of the data plus indexing that makes querying faster. Since the serving layer is not up to date by definition, the speed layer is there to fill that gap. The speed layer can be truncated every time new views are materialized in the serving layer. This architecture should solve the problem of having query = function(all-data) no matter how big all-data is going to be. The reason f(data) is feasible is that immutability allows for horizontal scaling.
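To make the query = function(all-data) idea a bit more concrete, here is a minimal sketch in plain Clojure. It is not code from the book: the page-view events, the cutoff timestamp and every function name below are assumptions made up for illustration. The batch view covers all facts up to the cutoff, the speed view covers only what arrived afterwards, and a query merges the two.

```clojure
;; Master dataset: an append-only log of immutable facts.
;; These page-view events are hypothetical sample data.
(def events
  [{:url "/home"  :ts 1}
   {:url "/about" :ts 2}
   {:url "/home"  :ts 3}
   {:url "/home"  :ts 4}])

(defn count-by-url
  "The 'function' in query = function(all-data): page views per URL."
  [evts]
  (frequencies (map :url evts)))

(defn batch-view
  "Batch/serving side: precomputed over every event up to and including cutoff."
  [evts cutoff]
  (count-by-url (filter #(<= (:ts %) cutoff) evts)))

(defn speed-view
  "Speed side: covers only the events newer than cutoff."
  [evts cutoff]
  (count-by-url (filter #(> (:ts %) cutoff) evts)))

(defn query
  "Answers a query by merging the stale batch view with the recent speed view."
  [evts cutoff url]
  (get (merge-with + (batch-view evts cutoff) (speed-view evts cutoff)) url 0))

;; Example call: page views for "/home" with the batch view computed up to ts 2.
(query events 2 "/home")
```

Because the facts never change, moving the cutoff forward only shifts work from the speed view to the batch view; the merged answer stays the same, which is why the speed layer can be truncated whenever the batch side catches up.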
What’s new in midje 1.6 · marick/Midje Wiki In the latest alpha3 of the upcoming Midje 1.6 release, several changes have been made to make Midje compatible with standard continuous integration requirements, so that test suite output shows up as part of the CI report. There are also a few other improvements in Midje customisation for colors, autotest scope and more.
About the Data Set - Common Crawl - Confluence Bookmarking this although it is not strictly Clojure-related. Clojure is often used to solve big data problems, a business that requires scripting capabilities, raw performance and a good set of supporting libraries (I’m thinking of all the S3, EMR and Hadoop wrappers out there, plus Cascalog and Storm). Well, if you happen to have a spare 81TB on your laptop drive you can play around with the Common Crawl dataset, the raw data you obtain when you request 6 billion web pages, HTTP headers included. The archive is already split into multiple S3 files based on the date of crawling, so it might be a better idea to use an EC2 instance or EMR for your processing. Just saying…
clojure.java.io - Clojure v1.5 API documentation The resource function in clojure.java.io is a typical Swiss Army knife that works in many contexts. It returns the URL of a resource that is visible on the classpath, using the context class loader unless another loader is specified. Used with “slurp”, for example, it can bring plain text into memory in a one-liner, provided the text is available on the classpath.
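A minimal sketch of the resource plus slurp combination follows. The helper name read-resource is hypothetical (it is not part of clojure.java.io), and the example assumes that clojure/core.clj is on the classpath, which is normally the case because the Clojure jar ships its own sources.

```clojure
(require '[clojure.java.io :as io])

;; `read-resource` is a hypothetical helper name used only for this example.
(defn read-resource
  "Returns the text of the classpath resource at path, or nil when it is missing."
  [path]
  ;; io/resource returns nil when nothing is found, so guard before slurping.
  (when-let [url (io/resource path)]
    (slurp url)))

;; Assumption: clojure/core.clj is visible on the classpath.
(def core-source (read-resource "clojure/core.clj"))

;; The bare one-liner, without the nil guard, would be:
;; (slurp (io/resource "clojure/core.clj"))
```

The guard matters because the unguarded one-liner hands nil to slurp when the resource is missing, which fails with an exception rather than returning nil.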