Monitoring an infrastructure is one of the most trivial things around, isn’t it? Yet a lot of people are still not happy with the way current monitoring tools work. Last summer John Vincent (@lusis) started a trend on twitter and #monitoringsucks was born. Today it’s a Twitter hashtag, an irc channel and also a github group, but mostly it’s a group of people talking about how to build better tools, or how to glue tools that already exist together.
But why does monitoring suck? No better an explanation than quoting John Vincent himself from his blog.
Monitoring is AWESOME. Metrics are AWESOME. I love it. Here’s what I don’t love:
* Having my hands tied with the model of host and service bindings
* Having to set up “fake” hosts just to group arbitrary metrics together
* Having to collect metrics twice – once for alerting and another for trending
* Only being able to see my metrics in 5 minute intervals
* Having to choose between shitty interface but great monitoring or shitty monitoring but great interface
* Dealing with a monitoring system that thinks it is the system of truth for my environment
* Not actually having any real choices
You could also add the lack of automation possibilities in most monitoring solutions, as this is the criterion that limits choice most. So even with a huge number of open source tools out there, each getting bits and pieces right, people aren’t happy, and every couple of weeks a new effort to solve monitoring – or at least to improve it – starts. The monitoringsucks crew have set up a github repository pointing to all kinds of different tools that are around.
Most people agree that there are plenty of good tools around if your infrastructure is small-to-medium sized. It starts to be more problematic when your infrastructure grows, when you have more and more items to monitor and more and more items to measure. With the introduction of “infrastructure as code” people want to be able to deploy a service automatically, and also include monitoring in that deployment; that’s one area where current tools are not ready. Another area, of course, is scaling the monitoring solution itself. If you need a full time DBA to manage the database where your metrics are being sent because otherwise it is too slow to accept new metrics, there is a problem. Add to that the fact that most monitoring tools are written with static environments in mind, or at least infrastructures in a local network, not a flexible cloud style environment where machines are being decommissioned even faster than they were provisioned.
All of these problems are popping up in today’s web infrastructures. All of the larger web applications, the social networking sites, the online shops and their friends, are starting to feel the pain, so they are looking for solutions.
As the devops community is pretty keen on monitoring, it’s no surprise that there is a huge overlap between both communities. This means there is a large group of people out there who are investigating new ways to tackle the problem, and they are sharing their experiences and their code.
In February Inuits, the leading Open Source consultancy, hosted a 2 day hack session in their offices, which gathered people from different stakeholders who were trying to solve the problem of monitoring. People from large sites such as TomTom, Booking.com, Atlassian, Spotify, and others were present to discuss and share their ideas.
One of the core ideas that came out of this discussion was to go back to the old unix philosophy and build a chain of tools that work closely together, each specialized in their own area. With small changes to existing tools, a tool chain could be built that would collect data and throw it on to a message bus. Then other tools could be listening to that bus to transform that data, make statistics out of it, base alerts on it, graph it, archive it, or perform analytics on top of it.
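The tool chain idea described above can be illustrated with a minimal sketch. It is not the design that came out of the hack session: the `MessageBus` class, the `metrics` topic, the handler names and the sample values are all hypothetical, and a real setup would put an actual message broker between separate processes. The sketch only shows the shape of the idea: a collector publishes samples, and independent consumers archive them, alert on them, or compute statistics.

```python
from collections import defaultdict
from statistics import mean


class MessageBus:
    """A toy in-process stand-in for a real message bus (hypothetical)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # Register a callable that receives every message on the topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to each subscriber of the topic, in order.
        for handler in self._subscribers[topic]:
            handler(message)


bus = MessageBus()
archive = []  # filled by the archiver: every raw sample
alerts = []   # filled by the alerter: samples above a threshold


def archiver(sample):
    archive.append(sample)


def alerter(sample, threshold=0.9):
    if sample["value"] > threshold:
        alerts.append("%s %s=%.2f" % (sample["host"], sample["metric"], sample["value"]))


# Each tool does one job and only knows about the bus, not about the others.
bus.subscribe("metrics", archiver)
bus.subscribe("metrics", alerter)

# The collector throws made-up CPU samples onto the bus.
for host, value in [("web1", 0.42), ("web2", 0.95), ("web1", 0.50)]:
    bus.publish("metrics", {"host": host, "metric": "cpu", "value": value})

# A statistics tool can work from the archived samples.
average = mean(sample["value"] for sample in archive)
```

Adding a graphing or analytics tool would then be a matter of subscribing one more handler, without touching the collector.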
Plenty of tools exist in the area and plenty of new tools are popping up almost weekly, but one thing is for sure… monitoring is not a solved problem yet. It still sucks.
Kris Buytaert is a long time Linux and Open Source Consultant. He’s one of the instigators of the devops movement, currently working for Inuits.
Kris is the Co-Author of Virtualization with Xen, used to be the maintainer of the openMosix HOWTO and is the author of different technical publications. He frequently speaks at, or organizes, different international conferences.
He spends most of his time working on Linux Clustering (High Availability, Scalability and HPC), Virtualisation and Large Infrastructure Management projects, trying to build infrastructures that can survive the 10th floor test (better known today as the cloud), while actively promoting the devops idea!
His blog titled “Everything is a Freaking DNS Problem” can be found at http://www.krisbuytaert.be/blog/.
Devops, Devops, Devops… the whole world is talking about Devops, but what is Devops?
I have a slide deck where I tell people that a Devop is a 30 something Senior Infrastructure guy with a strong Development background, a lot of Open Source Experience, who is mostly European (.be / .uk) and who likes Belgian Beer and Sushi. Of course that is an absolutely correct description of the people involved in the early days of the Devops movement, but it has nothing to do with what Devops really is about. It’s just a fact that the first Devopsdays conference was held in late 2009 in Gent, and a lot of the people involved in the early days were these Senior Infrastructure guys.
Devops started out much earlier than this, though. We’ll never know when it started because it’s nothing new; it’s just a new common name for things people have already been doing for ages. Devopsdays Europe started because a group of people met over and over again at different conferences throughout the world. These people were talking about software delivery, deployment, build, scale, clustering, management, failure, monitoring and all the important things one needs to think about when running a modern web operation. These people included Patrick Debois, Julian Simpson, Gildas Le Nadan, Jez Humble, Chris Read, Matt Rechenburg, John Willis, Lindsay Holmwood and Kris Buytaert. On the other side of the ocean O’Reilly put on a conference that sounded interesting to a bunch of us Europeans: Velocity. On this side of the ocean we had to resort to the Open Source, Unix, and Agile conferences. We didn’t really have a meeting ground yet. At CloudCamp Antwerp, in the Antwerp zoo, Patrick Debois decided to organise Devopsdays Gent.
Devopsdays Gent was not your traditional conference; it was a mix of a couple of formal presentations in the morning and open spaces in the afternoon. And those open spaces were where people got most value: the opportunity to talk to people with the same complex problems, with actual experience in solving them, with stories about both success and failure. How do you deal with that oldskool system admin that doesn’t understand what configuration management can bring him? How do you do Kanban for operations while the developers are working in 2 week sprints? What tools do you use to monitor a highly volatile and expanding infrastructure?
Yet I still haven’t defined devops! :) So... what is devops? Let’s start off by agreeing that we haven’t agreed on a real definition yet, and never will, but there is a lot of common ground that defines the ideas behind it. One is the idea that there needs to be more communication between the different stakeholders in an IT lifecycle: developers, operations people, network engineers and security engineers need to be involved as soon as possible in the project in order to facilitate each other and talk about solving potential pitfalls ages before the application steps into production. Some people therefore say that devops is the wrong terminology and that it needs to be *ops, but if you say that out loud it becomes starops, which might lead to the confusion that it’s about rockstars in operations. That is not really the goal we’re trying to achieve; we’re trying to build teams that work together across disciplines with a joint goal.
Lots of people also think the devops ideal includes continuous delivery. The idea that you should be able to deploy multiple times a day, that automation will help you make such deployments possible and not problematic, not that it’s about being able to, but about having to. Obviously a full chain of automation from development to production helps here, and adequate testing needs to be in place on all levels – functionality, scalability, availability, security.
Also the idea that the involvement of developers doesn’t end at their last commit, so that operations just pulls in the code and makes it run, but that they are also involved in keeping it running smoothly. After all, software with no users has no value. The involvement of the developers in the ongoing operations of their software shouldn’t end before the last end user stops using their applications. This also means that software developers need to provide the necessary hooks in their application so we can automate testing and measuring of different components of the application, both during test and during operation.
Summarizing this in an acronym coined by John Willis and Damon Edwards – CAMS. CAMS says devops is about Culture, Automation, Measurement and Sharing, but it’s definitely not a SCAM.
On September 8th, Prague will be hosting the MATTONI Grand Prix – a chance to run at night through the illuminated streets. The Old Town Square and its surroundings will be animated with live music, dancing and many other great attractions. I am sure that you can feel the joy and the wonderful sense of achievement – that is until you meet my friend Bogomil Shepov aka Bogo.
After being an employee for many years, Bogo has decided to start his own company. Nothing new or particular about this you will say… but promotion of the new company is a little different as Bogo is using Twitter and his legs.
For every re-tweet Bogo receives, he will run one meter during the Grand Prix, he will put the sender’s twitter name on his t-shirt and donate $0.05 to either:
- Creative Commons
- Mozilla Foundation
- Electronic Frontier Foundation
- Eclipse
- Apache
I should say that I never saw Bogo run – he is as athletic as I am. My favourite picture of him is sitting in a restaurant with a pint of beer or a bottle of Rakia discussing the state of politics or some techie innovation. Read more about Bogo here. And please retweet his message, I want to see him run and run… To do so click this message and add a comment together with your Twitter username.
I don’t think Bogo understands the art of sponsorship but I am sure he knows how to promote his company.
Elastacloud, in conjunction with Microsoft, will be holding a one day conference on Windows Azure in Fulham, West London. There will be prolific speakers discussing all aspects of the cloud across a range of new and interesting topics.
The day will see speeches by prolific Microsoft authors and evangelists as well as members of the UK partner community. The conference is open to a wide variety of individuals from different backgrounds who may be new to Windows Azure. We are looking to reach out to the PHP and Java communities to show them that an open source (WAMP) stack is wholly compatible with Windows Azure. There will be talks on PHP, Java and node.js by well-respected developers describing the how-to as well as projects which have been successfully implemented on Windows Azure.
In addition, there will be talks on mobile and Windows 8 integration with the cloud demonstrating how the cloud has become a fundamental part of Microsoft’s newest operating system and mobile offering.
Talks will be held on other aspects of Windows Azure such as High Performance Computing, Big Data, Cloud Storage. Newer concepts such as Infrastructure as a Service, advanced messaging and the creation of secure hybrid applications will be discussed in depth as well. There will be a keynote speech given by a prolific member of Microsoft in the morning and in the afternoon the conference will break into 3 tracks.
All conference registrations will be free up until the 20th May and £25 thereafter so get registering now!
The day event is being run via the UK Windows Azure User Group which has built a strong community presence evangelising the use of Windows Azure to developers throughout the UK. The user group can be found here and on Twitter @ukwaug.
Richard is the founder of the dynamic Windows Azure User Group. He is the first point of contact for user group members wanting to participate in community codeplex projects.
Fluidinfo is a shared online datastore based on tags with values. It allows anyone to store, organize, query and share data about anything.
Users add information to Fluidinfo by associating data to things via named tags. It’s easy to extract information and there’s a simple query language to search the datastore.
Five core concepts cover how Fluidinfo works:
- Objects represent things (real or imaginary).
- Tags attach information to objects.
- Users use their own tags to attach data to shared objects.
- The permission system controls who can read and write each tag. (Objects do not have owners, so anyone can tag any object.)
- Queries select objects by specifying properties of their tags and values. The selected objects can be read or tagged.
Objects representing “things” are identified by a globally unique “about” value string. There is a (lazily instantiated) object in Fluidinfo for every possible Unicode string: “paris”, “☾♠♣♥♦☽”, “book:getting started with fluidinfo (nicholas j radcliffe; nicholas h tollervey)”, “http://oreilly.com”, “3.14159265” and so on.
Each user attaches arbitrary named tags to any of these objects. The tags can have typed values; for example, I might tag the object about “paris” with a numeric rating (ntoll/rating=10), a string description (ntoll/description=”Belle”) and a JPEG image (ntoll/photo=<a jpeg blob>).
The permissions system lets users control who can read and write each of their tags, and a simple yet powerful query language allows them to select objects and read data based on their own tags and those of other people (subject to the permissions). For example the following query matches objects that I (ntoll) have added a comment to, that both the user njr and I have added a rating greater than 6 to and whose about values contain the string “book:”:
has ntoll/comment and (ntoll/rating > 6 and njr/rating > 6) and fluiddb/about matches "book:"
Notice how tag names are scoped by a username (for example ntoll and njr in the example above) so the provenance of the data is apparent.
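To make the meaning of that query concrete, here is a toy in-memory model of it in Python. This is only a sketch of the semantics described above, not how Fluidinfo is implemented: the `objects` dictionary, the sample tag values and the `matches` helper are made up for illustration, and the `matches` operator of the real query language is approximated with a plain substring test.

```python
# Hypothetical stand-in for the datastore: about value -> {tag path: value}.
objects = {
    "book:getting started with fluidinfo": {
        "ntoll/comment": "Handy",
        "ntoll/rating": 9,
        "njr/rating": 8,
    },
    # No ntoll/comment tag, so "has ntoll/comment" fails.
    "book:some other title": {"ntoll/rating": 7, "njr/rating": 9},
    # Has every tag, but the about value does not contain "book:".
    "paris": {"ntoll/comment": "Belle", "ntoll/rating": 10, "njr/rating": 7},
}


def rated_above(tags, path, minimum):
    # A missing tag never satisfies a comparison.
    return path in tags and tags[path] > minimum


def matches(about, tags):
    """Mirror: has ntoll/comment and (ntoll/rating > 6 and njr/rating > 6)
    and fluiddb/about matches "book:" (approximated as a substring test)."""
    return (
        "ntoll/comment" in tags
        and rated_above(tags, "ntoll/rating", 6)
        and rated_above(tags, "njr/rating", 6)
        and "book:" in about
    )


selected = sorted(about for about, tags in objects.items() if matches(about, tags))
```

Only the first object satisfies all four conditions, so `selected` holds a single about value.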
A RESTful API
At the lowest level, interaction with Fluidinfo is via a pure HTTP, RESTful API. Being HTTP-based, the API can be queried directly from a web browser. For example, you can directly read my rating of Paris by visiting: http://fluiddb.fluidinfo.com/about/paris/ntoll/rating (the result will probably be downloaded as a file containing the value 5) or (better) by using curl or wget from the command line:
curl http://fluiddb.fluidinfo.com/about/paris/ntoll/rating
Similarly, you can find all the objects I have rated greater than three and retrieve the “about” value and my rating by navigating to:
http://fluiddb.fluidinfo.com/values/?query=ntoll/rating>3&tag=fluiddb/about&tag=ntoll/rating
The response will be JSON. If you have any familiarity with URL semantics, you’ll be able to see that this URL goes to the /values endpoint of Fluidinfo, passing the query parameter “ntoll/rating>3” and two tag parameters: “fluiddb/about”, which is the name of the object identifier (Fluidinfo’s so-called “about” tag), and “ntoll/rating”, the tag I’m using to rate things in Fluidinfo. This is the Fluidinfo equivalent of the following faux-SQL:
SELECT ntoll/rating, fluiddb/about WHERE ntoll/rating>3;
Writing data is only marginally more complicated than this, requiring authentication and sending the appropriate data in JSON.
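If you would rather not assemble such URLs by hand, the read query above can be built with the standard library, which also takes care of percent-encoding characters such as “/” and “>”. This is a small sketch that assumes Python 3 (`urllib.parse`); the `values_url` helper is a hypothetical name, not part of any Fluidinfo library, and it only builds the URL without sending a request.

```python
from urllib.parse import urlencode

BASE = "http://fluiddb.fluidinfo.com"


def values_url(query, tags):
    """Build a /values URL with one query parameter and repeated tag parameters."""
    params = [("query", query)] + [("tag", tag) for tag in tags]
    # urlencode percent-encodes each value: "/" becomes %2F and ">" becomes %3E.
    return BASE + "/values/?" + urlencode(params)


url = values_url("ntoll/rating>3", ["fluiddb/about", "ntoll/rating"])
print(url)
```

The resulting string can be passed to curl or to any HTTP client.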
Client Libraries
While it’s entirely possible to work with Fluidinfo via curl, it’s probably a better idea to use a client library. For example, the following Python session illustrates the simple fluidinfo.py module which forms a thin layer on top of the HTTP API (there are more abstract libraries such as FOM [the Fluid Object Mapper] and the event-driven fluidinfo.js).
Use the pip tool to install the fluidinfo.py module:
$ pip install -U fluidinfo.py
... lots of downloading/unpacking messages...
Import the required modules (we use pprint to make the output more readable):
$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) [GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import fluidinfo
>>> from pprint import pprint as pp
Ensure we’re logged in:
>>> fluidinfo.login('ntoll', 'secret_password')
The library returns the HTTP headers and the data from Fluidinfo as Pythonic data structures you can use immediately:
>>> headers, result = fluidinfo.get('/about/paris')
>>> pp(headers)
{'cache-control': 'no-cache',
 'connection': 'keep-alive',
 'content-length': '424',
 'content-type': 'application/json',
 'date': 'Mon, 16 Apr 2012 13:41:25 GMT',
 'server': 'nginx/0.7.65',
 'status': '200'}
>>> pp(result)
{u'id': u'881d95b2-e9f0-40c8-a11e-964f349e01b1',
 u'tagPaths': [u'fluiddb/about',
               u'musicbrainz.org/related-artists',
               u'wordtools/gcide',
               u'wordtools/foldoc',
               u'musicbrainz.org/related-albums',
               u'terrycojones/testing/test1',
               u'terrycojones/rating',
               u'musicbrainz.org/related-tracks',
               u'en.wikipedia.org/url',
               u'r/r',
               u'book/related-books',
               u'book/r',
               u'fergusstothart/rating',
               u'fergusstothart/Old_Map',
               u'fergusstothart/Map',
               u'ntoll/rating']}
Adding a new value to the object about “paris” is simple:
>>> headers, result = fluidinfo.put('/about/paris/ntoll/comment',
... 'I prefer the tower in Blackpool. :-)')
>>> pp(headers)
{'cache-control': 'no-cache',
 'connection': 'keep-alive',
 'content-type': 'text/html',
 'date': 'Mon, 16 Apr 2012 13:42:34 GMT',
 'server': 'nginx/0.7.65',
 'status': '204'}
The 204 status code shows it was a success. The same applies for deleting the value:
>>> headers, result = fluidinfo.delete('/about/paris/ntoll/comment')
>>> headers['status']
'204'
It’s possible to sign up for an account via Twitter at the http://fluidinfo.com website. You must ensure you set your password for API access (click on your user profile to do so).
Flying Spaghetti Monster
If Fluidinfo sounds interesting, give it a spin. And if you have any interest in the back end, you could do worse than to check out the new O’Reilly book “Getting Started with Fluidinfo” with a magnificent jellyfish on the cover. Alternatively, if you’d like to find out more about the story behind Fluidinfo then check out Fluidinfo – An Unfundable World-Changing Start-Up.
(About the author: Nicholas H. Tollervey works for Fluidinfo and co-authored the O’Reilly Fluidinfo book. He’s a classically trained musician, philosophy graduate, teacher, writer and software developer. He’s just like this biography: concise, honest and full of useful information.)
In late 2008 I was working as a freelance Python developer. I’d recently started to use Twitter and among the various “famous” people I had followed was Tim O’Reilly. One of his tweets caught my attention; it was about an interview that tech-blogger-about-town Robert Scoble had recorded with a hacker who was building “something amazing” called Fluidinfo (so amazing that Scoble described it as “the unfundable world-changing start-up”).
Here are the summary notes I jotted down while trying to make sense of what I was watching:
- A re-imagining of a data store in the cloud.
- Openly writeable (anyone can add data a la Wikipedia philosophy).
- Simple to find useful information by mashing up data from many different contributors.
- An evolutionary model of data / emergent schema.
- Bottom up rather than top down.
After a couple of hours of unlit, uncut and rather shonky video shot in the hacker’s apartment in Barcelona, my first thought was, “this hacker’s either a complete nut-job or he’s on to something”.
I had to find out more.
It turned out that the hacker was called Terry Jones and, as they would say in the Matrix, went under the “hacker alias” of terrycojones.
Terry, it appeared, had balls.
A quick look at the biography on his website indicated that he’s actually Terry Jones BSc MSc PhD ~ unicyclist (see below), mathematician, chess player, computer scientist, virologist, writer and entrepreneur (not necessarily in that order). I sent him a long rambling email asking for more information about the [r]evolutionary nature of Fluidinfo.
Three months later I got a reply with detailed answers. There was obviously substance to the ideas and claims first made in the Scoble video. At the end of the email Terry invited me to proof-read the developer documentation for the soon-to-be released Alpha version of Fluidinfo. Contained therein was a thing of beauty: a simple RESTful API backed up by an elegant philosophy of how a data store should behave.
I was hooked.
I was also going to the next EuroPython conference in 2009 and so, it appeared, was Terry: the very first talk of the very first session was called “Introducing Fluidinfo”.
Terry and I chatted for a while after his presentation and I learned that he’d been thinking about and planning Fluidinfo for ten years. It quickly became apparent that Fluidinfo was a labour of love. Terry had sold his apartment to finance development of Fluidinfo, given up a career as a computational biologist at one of the world’s leading universities to work full time on it and was in the process of getting a round of funding sorted out before the money (and any chance of getting his apartment back) ran out.
Terry deserved his “hacker alias”.
As with many developer conferences, some of the best stuff at EuroPython happened in the “corridor track”, that setting halfway between Josette’s O’Reilly book stall and the coffee bar. It was serendipity that had me sit down in the corridor next to someone I initially thought was an Irish hacker (by the look of him at least: he had a ruddy complexion, a reddish beard and was wearing green – to me that looks “Irish”, perhaps I was thinking leprechaun). I was wrong, the guy was from Barcelona, a member of the Apache Software Foundation and called Esteve. We had a long and often funny conversation at the end of which I asked the inevitable, “so who do you work for?”. His immediate answer, delivered in the same way that Inigo Montoya introduces himself in the Princess Bride, was “I work for Fluidinfo and we’re going to change the world”.
Terry wasn’t the only one connected with Fluidinfo who had balls.
It turned out that Robert Scoble was wrong: a few months later Fluidinfo received funding through Betaworks and can count Joshua Schachter (Delicious founder), Esther Dyson (journalist and investor) and Tim O’Reilly (whose tweet originally brought my attention to Fluidinfo) as investors. Last year, when Fluidinfo won the “Best Technology” category at the Launch Conference, Scoble was heard to say, “I was wrong”.
When Fluidinfo got funded I joined as employee #3 (we’ve since grown to around nine people spread around the globe working as a distributed team via the wonders of IRC, Skype and Github).
Having high profile investors is very helpful. For example, during an interview at the South by Southwest gathering Tim O’Reilly named Fluidinfo as his favourite start-up. Such PR gifts are usually followed by a flurry of blog-posts and tweets from people who, upon discovering Fluidinfo, announce how amazing the concept is. A typical example is, “this should blow your mind: Meet Fluidinfo, the most disruptive start-up you haven’t heard of”.
Herein is perhaps the biggest challenge facing us: despite being lauded by the tech-cognoscenti we need to take the abstract philosophy and existing technology that underpins Fluidinfo and make it available and understandable to everyone.
To this end we’re working with and learning from various large companies who would like their own instance of Fluidinfo running within their organisation for the purposes of data sharing and information management. We’re also listening carefully to our existing “early-adopter” users and introducing new features where required. Furthermore, we’re experimenting with a web-based front-end for non-technical “civilian” users. We hope that the fruits of our labours will be arriving within weeks.
In the meantime, there is an O’Reilly book called “Getting Started with Fluidinfo” (with a magnificent Flying-Spaghetti-Monster-like jellyfish on the cover) which explains the API and philosophy behind what we’re building. Alternatively, a simple technical introduction to Fluidinfo will be appearing on this site very soon.
More information about Fluidinfo can be found at our website or at our blog.
(About the author: Nicholas H. Tollervey works for Fluidinfo and co-authored the O’Reilly Fluidinfo book. He’s a classically trained musician, philosophy graduate, teacher, writer and software developer. He’s just like this biography: concise, honest and full of useful information.)
In March I attended Codemotion in Rome – an event for the community, created by the community. The event has grown yet again, not only in the number of attendees (over 3000) but also from a one day conference to a full two days and a concurrent event in Madrid. There was not a free minute, just talk after talk.
Talks
On Friday, the conference was divided into 5 tracks:
- Hot topics
- Web
- Mobile
- Innovation
- Enterprise
On Saturday, we had 7 tracks:
- Open
- Web
- Languages
- Methods
- Gaming
- Security/HW
- Web/Mobile
With almost 100 talks, all the hot topics were covered: Cloud, JavaScript, HTML5, Agile, Arduino, Big Data, Hadoop, JAX-RS 2.0, Gamification and much more.
Don’t forget your CV
All the sponsors had a stand at Codemotion, not to present their products but to collect CVs. This event allows people to present themselves to key companies, drop off their CV, have a chat with HR people and hopefully get the job of their dreams, or at least start the hiring process. The Codemotion people also offer tips on CV writing, presentation etc. Sponsors included Samsung, Telecom Italia, Oracle, Red Hat, Microsoft and many more.
Design
I liked the design by Woma – I like the idea of cubes that can be piled up in different ways to say different things.
Fellow traveller
Surprise! On Friday morning, as I was setting up, I met Arun Gupta (Oracle) who very kindly brought me breakfast. Nice! I first met Arun during the first part of the week at the 33rd Degree conference in Krakow. He came to Rome to give his talk on “JAX-RS 2.0: RESTful Java on Steroids” and then flew to Codemotion Madrid before going back home to the US. I don’t think I could have attended yet another conference without a little break.
Future
Let’s see if we can make it to 4000 attendees next year. I hope to see you there for a great, well-organized meeting. “Non omnia possumus omnes”, but coming to Codemotion might be the first step towards your goal.
From Student House to the White House
In 2009 when the White House was looking for a more flexible system to power their website, a wide range of applications were evaluated and the decision was made to use Drupal, a popular Open Source licensed content management system (CMS).
The system that was picked to power the website of the U.S. President started life a decade earlier in a student residence at Antwerp University, as a bulletin board for students to post personal news and discuss where to meet for dinner and drinks.
Its creator, Dries Buytaert, continued developing it further to experiment with newly emerging Internet technologies, and when he released it under an Open Source license, other developers from around the world joined in with their own experiments. Drupal quickly developed into a cutting edge application that was powering major websites.
10 Years Of Growth
The success of Drupal can be attributed largely to its very modular framework, which enables new features to be added cleanly and easily by installing discrete modules, and there is an impressive library of over 10,000 modules now available.
The modularity of the system has also led to the growth of a huge community of individuals and organisations contributing to Drupal and providing a range of support services. The drupal.org website currently has 800,000 registered users and almost 5,000 people attend the annual Drupal conferences.
Major organisations such as the BBC, ITV, Channel 4, MTV, Unilever, and Comic Relief are using Drupal across a variety of projects.
Drupal For Government
In the UK Drupal is popular with local councils, and national government is increasingly adopting it as part of its commitment to use open source software “where it delivers best value for money”, most notably on high profile projects such as data.gov.uk, Cabinet Office, and Royal Mail.
The London Drupal Community is therefore organizing –
Drupal For Government
Drupal For Government is an event showcasing the Drupal work of several government departments with an evening of presentations and discussion organised by the London Drupal Community in association with Capgemini.
Entrance is free, please register here to attend.
Monday, April 16, 2012 at 6:00 PM
Capgemini, 40 Holborn Viaduct, London EC1N 2PB
http://drupal4government2012.eventbrite.com
Yet another conference that one should not miss according to Dubi Rajakovic, Vice President, Europe Link Ltd, organizer of SourceDevCon 2012. The conference will be held at the Park Plaza Riverbank, London on May 2-5. I hope to see you there!
SourceDevCon 2011, held in Split, Croatia in May 2011, proved to be the homecoming for mobile and desktop web application developers around Europe. The conference received top marks for content, education, and overall experience.
This year’s conference will bring speakers from all around the globe to present on topics of interest to web application developers and entrepreneurs. The stellar lineup features Douglas Crockford, Seb Lee-Delisle, Remy Sharp, Martha Rotter, Joe McCann, Rich Waters, Brian Moeskau, Jay Garcia, Mats Bryntse, and many more experts from the field.
Covered areas include modern HTML5 and CSS techniques, client and server side JavaScript, source control management practices, embedding apps into social media platforms, web games development, and more.
Highlights of this must-attend event for web application developers include sessions on reaching excellence with Sencha Touch, Ext JS, PhoneGap, Node.js, MongoDB, Git, CanvasJS, Ruby, Rails, PHP, and more.
Three main conference goals are to deliver information, networking, and a fun experience to the attendees. This year’s SourceDevCon will also feature award-rich contests, all attendees being eligible to participate!
The schedule starts with a day of training in building web apps with JavaScript, ExtJS, and Sencha Touch, featuring Modus Create, followed by the Welcome reception later that evening.
Two days of intensive multi-track presentation-style sessions form the core of the conference. The event closes with a bombastic day of fun and networking, details being kept secret for now!
Correction: I will not be there on the last day and no, I was not let in on the secrets... Hope you have a great time!
Infobright was founded in 2005 to leverage a mathematical approach, called Rough Set, to solve data management and analytic problems. Rough Set started with a Polish computer scientist Zdzislaw Pawlak in 1981 as a mathematical tool to deal with vague concepts. In his theory, Pawlak describes the lower and upper approximations of a set as crisp or conventional, defining the upper and lower boundaries. This approach is useful for rule induction from incomplete data sets, and in other variations, also helped to form approximating sets known as fuzzy sets. Dominik Slezak, one of Infobright’s founders, explains this as “we follow the rough set approach to identify: (1) the data portions that are fully relevant to the given query execution; (2) the data portions that are fully irrelevant to the given query execution; (3) the data portions that remain undecided.” This theory when applied to data mining and machine learning was quickly adopted by a variety of industries with different applications.
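The lower and upper approximations at the heart of rough set theory are easy to show with a toy example. The sketch below is a textbook-style illustration, not Infobright code: the `approximations` function, the integer universe and the grouping rule are all made up. Items that look identical under the available information are grouped into classes; the lower approximation collects the classes that lie entirely inside the target set, the upper approximation collects every class that touches it, and the difference between the two is the undecided boundary.

```python
def approximations(universe, key, target):
    """Return (lower, upper) approximations of `target`.

    Items with the same key(item) are indiscernible and form one class.
    """
    classes = {}
    for item in universe:
        classes.setdefault(key(item), set()).add(item)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies fully inside the target
            lower |= members
        if members & target:    # class overlaps the target at all
            upper |= members
    return lower, upper


# Hypothetical data: the numbers 1..8, indistinguishable within blocks of three,
# which gives the classes {1, 2}, {3, 4, 5} and {6, 7, 8}.
universe = range(1, 9)
target = {1, 2, 3, 6, 7, 8}

lower, upper = approximations(universe, lambda n: n // 3, target)
boundary = upper - lower  # items that can neither be ruled in nor ruled out
```

Here the class {3, 4, 5} only partly overlaps the target, so it ends up in the boundary while the other two classes are certainly inside.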
Infobright’s founders realized that Rough Set is a powerful tool to enable fast queries against large data sets without doing all the database administration work that had always been a requirement in the past to achieve fast performance. Instead of requiring indexes, data partitioning and other typical techniques, intelligence in the software could drive performance. Infobright calls this intelligence the Knowledge Grid. Wedding the Knowledge Grid with a columnar database architecture produced a powerful solution that could handle large amounts of data fast and simply, at a very low overall TCO.
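As a rough illustration of how stored metadata can replace indexes, the sketch below answers a "count the values greater than a threshold" query using only a small summary per block of column data. It assumes, purely for illustration, that each block records its minimum, maximum, and row count. The names `PackSummary`, `classify`, and `count_greater` are hypothetical, and the real Knowledge Grid is not described here in enough detail to claim this is how it works.

```python
from dataclasses import dataclass


@dataclass
class PackSummary:
    """Hypothetical per-block metadata kept alongside the column data."""
    lo: int
    hi: int
    rows: int


def summarize(values):
    return PackSummary(lo=min(values), hi=max(values), rows=len(values))


def classify(summary, threshold):
    """Classify a block for the predicate `value > threshold`."""
    if summary.lo > threshold:
        return "relevant"    # every row matches
    if summary.hi <= threshold:
        return "irrelevant"  # no row can match
    return "undecided"       # some rows may match, the data must be read


def count_greater(packs, threshold):
    """Count matching rows, reading raw data only for undecided blocks."""
    total, scanned = 0, 0
    for values in packs:
        # A real system would store the summary at load time;
        # it is recomputed here only to keep the sketch short.
        summary = summarize(values)
        kind = classify(summary, threshold)
        if kind == "relevant":
            total += summary.rows  # answered from metadata alone
        elif kind == "undecided":
            scanned += 1
            total += sum(1 for v in values if v > threshold)
    return total, scanned


packs = [[1, 2, 3], [10, 12, 15], [4, 9, 11]]
total, scanned = count_greater(packs, threshold=8)
```

In this toy run only the third block has to be read. The first is ruled out and the second is counted from its summary, which is the kind of saving that makes manual indexing and partitioning less necessary.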
In 2006 Infobright formed a partnership with MySQL, taking advantage of the “storage engine” architecture that MySQL had to encourage other companies to create new databases for different use cases while taking advantage of many MySQL functions. This integration meant that migrating from a row-based MySQL database to the Infobright columnar-based database would be as simple as a command line change.
Infobright first introduced its technology, then known as Brighthouse, at the 2007 Rough Set Conference in Toronto to a very positive reception. A few months later, in 2008, the company released the industry’s first commercial open source analytic database software (infobright.org) and started building a strong and growing open source user community, with more than 15,000 downloads in the first year. Within a year, Infobright had more than 40 customers, including ISV and OEM customers who embedded Infobright’s solution in their own software offerings.
Since 2008, there have been many other changes. For one thing, the product is now called Infobright, not Brighthouse. There are tools to integrate with major BI partners such as Pentaho, Jaspersoft, Talend and Informatica. Users can load data in several different ways depending on their needs – using MySQL loaders, distributed loading, or many other ETL tools. Customers have reached data load speeds of up to 200,000 records per second. Infobright positioned itself as the leader in the open source data warehousing community and, in recognition of its outstanding contributions to the MySQL ecosystem, was awarded the prestigious MySQL Partner of the Year award by Sun Microsystems in April 2009.
Using built-in intelligence, Infobright’s unique way of storing and analyzing machine-generated data provides a vehicle for near real-time analytics on big data. Machine-generated data has become one of the fastest growing categories of big data, with sources ranging from web, telecom network and call-detail records to data from online gaming, social networks, sensors, computer logs, satellites, financial transaction feeds and more. This focus on machine-generated data within big data, beginning in 2010, drove a rapid increase in customer momentum. Our latest version, released last summer, included Hadoop connectivity as well as the introduction of Domain Expert and Rough Query. Developed exclusively by Infobright, Domain Expert uses specific intelligence about machine-generated data to automatically optimize how data is stored and how queries are processed. Rough Query leverages our Knowledge Grid to deliver data mining “drill down” at RAM speed, otherwise known as “Investigative Analytics.”
From the beginning our executives and engineers recognized that the looming database challenge was how to analyze and extract actionable knowledge from very large (and growing) data sets. Clearly the market agrees as terms such as “big data” and “machine-generated data” become more commonly used. In addition, companies appreciate that Infobright’s approach – to work smarter not harder – means that their users can get fast query response – even to ad hoc queries – without a high overhead of database administration or hardware costs.
In 2011, eight of the top ten telecommunications service providers worldwide used Infobright, and they continue to use it, to mine their big data through our OEM partners. Hundreds of customers use Infobright daily and more than 100,000 users have downloaded our community and enterprise editions. Infobright is still leading the industry and paving the way.
By Craig Trombly
Infobright Open Source Community Manager