Pulling arbitrary data into Domo

This post is part of a series of posts I’m doing about Domo (the Business Cloud) and how I’m using the free tier to analyze metrics around write500.net. Click here for the full listing.

There are almost as many ways to integrate data as there are types of data and source systems that hold it – there’s no way to cover them all in the space of one blog post.

In terms of Domo, though, all data is tabular – Domo doesn’t deal with binaries, log streams or complex JSON documents. Just plain old CSVs, which makes sense if you’re doing BI for business – most of the things that matter to decision-makers can be represented in table form.

So long as you can get your data into CSV format, there are a good few options for getting it into Domo. In this case, I want to pull anonymous user data from write500 – just enough to chart user growth.

I’m also working within the limitations of what the free tier of Domo can do. The Enterprise tier is capable of more advanced acquisition, including basic JSON and CSV API endpoints.

In this example, I do have the benefit of being both the developer of the product, and the business user consuming the dashboards – so I can write any solution I need. Based on what the free tier is capable of, I have the following options:

That’s a lot of options – and at some point, I’ve used every one.

From top to bottom:

1. Cloud Storage
I could write a script on the server to package the data in a format that can be uploaded to cloud storage – so a zipped file pushed to Box, OneDrive or Dropbox for Business. Domo then has cloud connectors that will let me pull that down as usable data.

2. JSON API
I could expose the data through a simple JSON API, and use Google Sheets as an intermediary. It’s possible to use the built-in scripting language (Google Apps Script) and some scheduling to populate a Google Sheet, which is then trivial to import into Domo. It’s not very scalable, though.

3. Export to CSV
I could write a script to dump the required data to a CSV file on the server, then have Domo fetch it with the CSV/SFTP connector. This is easy to set up, though it still requires some scripting work.

4. Direct Database Access
I could use the MySQL connector to hit the write500 database directly. I’d have to open firewall ports, add users, and do a bunch of other setup first. I’m not in favor of doing that much work right now, though.

So we’ll go with option 3 – I’ll write a script to produce the exact CSV file I need, then set up the SFTP connector to fetch it.

Deciding what to fetch

There’s something of an art to deciding which metrics to pull – you first need to work out what’s actually useful to you, in terms of answering your business questions. In this example, I know I simply need a dataset that looks like this:

User ID | Date Registered     | Is Subscribed
1       | 2017-01-01 05:30:00 | 1
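
As a raw CSV – header line first, which is what Domo will use for the column names – that’s just the following (the quoting style is one reasonable choice, not a Domo requirement):

    "User ID","Date Registered","Is Subscribed"
    "1","2017-01-01 05:30:00","1"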

How you generate that CSV is up to you – in my case, I wrote a basic Bash script to do that:

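Roughly, a minimal version of that script could look like this – a sketch assuming a MySQL backend, where the table, column and path names are illustrative rather than the real ones:

    #!/usr/bin/env bash
    # Export anonymous user data to CSV, for Domo to fetch over SFTP.
    # Assumes MySQL credentials live in the exporting user's ~/.my.cnf.
    set -euo pipefail

    OUT="/home/domo-export/users.csv"

    # Header line first – Domo uses it for the column names.
    echo '"User ID","Date Registered","Is Subscribed"' > "${OUT}.tmp"

    # Dump the rows as tab-separated values, then convert to quoted CSV.
    mysql --batch --skip-column-names \
      -e 'SELECT id, created_at, is_subscribed FROM users' write500 \
      | sed 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/' >> "${OUT}.tmp"

    # Swap the file in atomically, so Domo never fetches a half-written CSV.
    mv "${OUT}.tmp" "${OUT}"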

Ok, so “basic” might be the wrong word to use – but all it’s really doing is generating a CSV file with the right header line and correctly-formatted data.

That script is set up to run every 30 minutes, and it will keep a fresh copy of users.csv in a dedicated user’s home directory.
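
The scheduling side is plain cron – one crontab entry for that dedicated user does the trick (the script path here is illustrative):

    # Refresh the CSV export every 30 minutes
    */30 * * * * /home/domo-export/export-users.sh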

Now to import that! Domo has a set of connectors for working with files:

1. That (+) icon is everywhere.
2. We’re looking for this one – the CSV/SFTP connector.

On the next screen, typical SFTP setup stuff – host, username and password. If you run a tightly-secured system, you might also need to whitelist an IP range so that Domo can reach your server.

3. Just needs a valid SSH (/SFTP) user.
4. Voila! Our file.
5. I want frequent updates!
6. Name, Describe, and Save.

Whew!

So, what we just did there: we set up the SFTP connector to fetch that CSV file once every hour. Every time it does, it will overwrite the dataset we have with new data – that’s what the Replace method in the Scheduling tab means.

Finally, named and described. It’s helpful to prefix all your datasets with a short project name – it makes them easier to find later.

In no time at all, we’ve got our data!

Sweet, sweet data.

From here, the next step is to visualize it. That’s a whole topic all on its own (so I won’t go into detail here), but here’s a dead-simple line chart built from that data, to show the trending over time:

54 opted-in users from a total of 80 or so.

So that’s a basic rundown of a CSV connector, end-to-end. This example does lean towards the “more complex” side of data acquisition when it comes to Domo. Luckily, most of the high-value stuff exists in systems that Domo already has great connectors for.

If you want to be notified when I write more of this stuff, check the Subscribe widget in my sidebar, top right. Or you can follow me on Twitter.

Using Domo – First Steps

This post is part of a series of posts I’m doing about Domo (the Business Cloud) and how I’m using the free tier to analyze metrics around write500.net. Click here for the full listing.

I’m a stats nerd. I love numbers, graphs, and digging into them to find meaning. So, when the opportunity came up at my employer (Acceleration) to partner with Domo and deliver cloud-based BI to customers, it was a natural fit. I’ve spent a bunch of time in both the product and the ecosystem, and have recently started using it to run analytics for my own projects.

A few months ago, Domo made free instances available – anyone can now sign up and get a feature-limited account. I did exactly that, and I’m hoping to write a few posts about how to use various parts of it to do really cool stuff.

What is Domo?

Domo is a cloud-based BI tool that’s aimed at enterprises. Their pricing reflects that, with the entry-level accounts sitting at $175/user/month. Which is not cheap. I haven’t checked comprehensively (there are lots of older-school solutions like SAS and Tableau out there) but it would not surprise me to learn that Domo is the highest-priced BI tool on the market.

The one thing that Domo does really well, compared to other solutions, is data acquisition. The cloud is a very messy place – some vendors have APIs, some don’t, and every API is different. It’s a far cry from the relatively straightforward world of ODBC, where vendors all adhered to the same standard for moving data around.

For some enterprises, the one-click nature of getting data in is the single biggest selling point. The transformation, charting and reporting, the social features, alerts and the mobile app are all nice – but the fact that a non-technical user can plug in a few usernames and passwords and get a dashboard? That’s the winning stuff.

Domo makes it very straightforward to access common sources (popular cloud tools, social networks, cloud productivity, etc), and equally straightforward to push data in via a REST-based API. I even built a PHP library for that (shameless plug alert).
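
To give you an idea of how simple that push API is: replacing a dataset’s contents boils down to two HTTP calls. Here’s a rough curl sketch (check developer.domo.com for the current contract – the dataset ID and credentials below are placeholders):

    # 1. Exchange API client credentials for a short-lived access token.
    TOKEN=$(curl -s -u "$CLIENT_ID:$CLIENT_SECRET" \
      "https://api.domo.com/oauth/token?grant_type=client_credentials&scope=data" \
      | sed -E 's/.*"access_token" *: *"([^"]+)".*/\1/')

    # 2. Replace the target dataset's contents with a local CSV file.
    curl -s -X PUT "https://api.domo.com/v1/datasets/$DATASET_ID/data" \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: text/csv" \
      --data-binary @users.csv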

So for today, I’m gonna get started by getting a Domo instance up and running, and pulling in Google Analytics data to power a simple dashboard.

This is by far the easiest thing I’ve ever done in this tool – the cool stuff comes later!

Getting Started

The signup form is here, and having just tried it, I got my instance invite about a minute later. Of course, the first thing you should do is company setup – just to put your name and face on it!

Handy little app launcher – everything’s here
Company Settings > Company Overview – just the basics!

There’s also a profile setup, but I’m skipping that here – I’m the only person using this instance for now.

Next up, I’ll use the Domo Appstore to deploy a pre-built Google Analytics dashboard. The Free tier of Domo includes 4 apps, so this is made very straightforward:

1. Launcher > Appstore
2. The Google Analytics QuickStart – pretty hard to miss!
3. Yes please, we’ll try it!
4. Naming it after the product I’m tracking

From there, Domo builds out a page with the name from step 4, and fills it with a bunch of charts all powered by sample data. This gives you a good idea of what sort of insights you’re going to get – but I’ll skip straight to the relevant part:

5. On the top right, Connect Your Data
6. Connect accounts! Future installs will skip this step.
7. Domo helpfully uses OAuth for this part
8. My account is now linked, and can be selected for future app installs (or new datasets).
9. Pick the view to get data for – just one for now.
10. And away we go!

And there we go. I’ve clicked my way to a dashboard.

11. Charts! Really quite simple.

It takes a few minutes to do the initial pull (longer if there’s a lot of data), but eventually it’ll all populate. The datasets that power that dashboard will automatically refresh once per day. The next best thing to do is get the mobile app set up, so you can reach those dashboards on the go.

For now, I’m going to be putting a dashboard together to track the growth of write500.net – it’s a good sample to use, seeing as it’s a source I have full control of, and I am actually going to try identifying trends I can capitalize on to drive user growth.

If you want to be notified when I write those upcoming posts, use that little Subscribe widget, top right of the sidebar.

Mobile is eating the future

I cannot not write about this excellent presentation from a16z:

Mobile is eating the world by Benedict Evans

Two big things that stand out for me (and there’s lots of high-density info in that deck):

Machine Learning
The ability for computers to understand the world around us got infinitely better once humans were taken out of the equation. Neural networks that train against vast sets of data and write their own rules turned out to be a lot more efficient than having human specialists trying to write those rules by hand.

Somewhere in 2016 (or maybe even as early as 2014) we crossed the first Rubicon: Human engineers may no longer be capable of keeping up with the intellectual growth of the machines they used to manage.

Fantastic news for scale and growth – computers can now write better and more efficient software, which in turn gets loaded on to ever-smaller and lower-powered devices. Ambient intelligence is just around the corner.

Slightly worse news for governance and accountability – at what point does the outcome of a program stop being the responsibility of the human engineers, and start becoming the responsibility of the neural network that designed its own decision tree?

The day we need to prosecute a neural network for a crime – that’ll be the second Rubicon. Once software has legal standing, the game changes again. Probably not for the better.

Mobile applied to Automotive
Mobile phones scaled out a hell of a lot faster than PCs ever could (hence the title of the presentation), and one of the areas where they’ve made a significant impact is manufacturing. The halo effect of having so many compact, mass-produced components means that hardware is no longer a true differentiating factor – it’s much more about the software and services that power those devices.

The same could be true for cars. We might be heading into a future where cars (taking “electric” for granted here) are assembled the same way smartphones are today – by pulling together off-the-shelf, interoperable components – and the key differentiator will be the services rendered through that car.

Which leads to the interesting thought of “Automotion-as-a-Service”.

I wonder if SaaS-type pricing will ever apply to the automotive industry. Bundled minutes become bundled miles, personal assistant integrations cost extra, and you get cheaper packages if you accept in-car targeted advertising. Somehow I think that might happen.

Assuming there isn’t a Final Optimization at some point, where the neural networks collectively decide we’re too much trouble to deal with 😉

Apple’s in the AI game!

They’re still not sharing everything, but at least they’ve put themselves on the map.

Apple (AAPL) is opening up a bit on the state of its AI research — Quartz

The self-driving car research isn’t as interesting to me as what they’re doing with image processing and neural networks:

According to the article, Apple’s framework can process images twice as fast as Google’s TensorFlow can. This’ll probably show up in consumer tech eventually (yet another camera update), but if they really do have better image processing, they could lead the field in machine vision.

Robot Wars 2020

In terms of AI research (and overall firepower), there are three major commercial factions spinning up, and there’s also an early (ergo promising) commitment to open-source.

One recurring theme right up front: both the Microsoft/OpenAI camp and Google have tools for training AIs to play games – OpenAI’s Gym, and Google’s DeepMind Lab. So I guess, if nothing else, game developers will eventually have an easier time when it comes to designing the CPU players!

Microsoft

Microsoft already has some of the “basic” business applications up – machine learning APIs, cognitive services, and so on. In November, Elon Musk’s OpenAI foundation partnered with Microsoft to use Azure as the primary cloud platform.

Some of the cool things in the Microsoft/OpenAI camp:

  • The OpenAI Gym: https://openai.com/blog/openai-gym-beta/
  • OpenAI Universe: https://universe.openai.com/
  • Cognitive Services: https://www.microsoft.com/cognitive-services/en-us/apis
  • Cognitive Toolkit: https://www.microsoft.com/en-us/research/product/cognitive-toolkit/

Microsoft is open-sourcing a decent amount of this stuff too. Not that I’d personally do anything with it.

Their avatar in the ring is Cortana – backed by Cognitive Services, no doubt: https://www.microsoft.com/en/mobile/experiences/cortana/

Google

Google’s got a lot of the same stuff out there that Microsoft has, especially in terms of web services. Where Microsoft has Azure, Google has the Google Cloud Platform. https://cloud.google.com/products/machine-learning/

They’ve also recently open-sourced the entire DeepMind Lab: https://deepmind.com/blog/open-sourcing-deepmind-lab/

Of course, the part I’m personally fascinated by? The Blizzard/DeepMind partnership, to train the DeepMind AI on how to play Starcraft 2: http://us.battle.net/forums/en/sc2/topic/20751114921

(Finally, someone might be able to give those Koreans a run for their money)

IBM

Then there’s IBM, who has a very big, very impressive, and incredibly opaque lead over the competition in one narrow area: Deep Learning.

http://www.ibm.com/watson/

That video – Watson’s Jeopardy! run – is five years old now. The server rack at 02:33 is probably a fifth of the size today, if not totally relegated to the cloud.

Since then, the tech behind Watson has grown a bit. It’s designed to mine large sets of unstructured data, form connections, and answer questions based on that data – so it’s got some really powerful analytics.

All of that is being sold as a service, though – don’t expect IBM to open-source anything too soon. Which I personally think will make them irrelevant by 2020.

2020

If the next four years are anything like the last four (in that everything seems to be speeding up), we’re almost definitely going to see AI-as-a-Service popping up for lots of different problem domains, personal assistants that can grasp context and finally be useful, and – if we’re lucky – the use of AI for things like smarter energy and resource management.

My money’s still, predominantly, on Microsoft to come out as the leader in this new field. Call me crazy 😉

On Record

Boy, have I got a story for you.

[Screenshot of the thread in question. Urination-over-IP, I guess.]

So yes, that’s me, in the screenshot above, calling the death of Apple’s relative lead in the app ecosystem wars – on 11 November 2016. When that thread dried up, I told myself that as soon as I got my new .blog domain hooked up to The Grid, I’d write a more detailed post explaining the reasons why.

(Then I found out that The Grid is crap, and set up here on WordPress instead)

For context: That conversation came out of a bit of hand-wringing around Apple’s new Touch Bar (and a few other really odd technical decisions on the part of Apple). One of the recurring themes from hardcore, longtime Apple users is that the “Pro” part of MacBook Pro became disingenuous with this release. The Mac is no longer for professionals.

There are two main threads behind my comment, though, so let me start with the company itself.

Post-Jobs Apple

To put it flatly: Apple died with Jobs.

I don’t mean the company itself – it’s still the most valuable in the world by market cap, and has brand equity second to none. As a going concern, it’ll be going for a very, very long time.

I don’t mean the products, though Apple has recently started trimming some of their smaller lines. We’ll have iPhones and MacBooks for years (if not decades) to come.

What I’m talking about is the spirit of Apple – the drive, the mystique, the vision, the quasi-religious, standard-setting, trail-blazing aspect of owning Apple hardware. That’s gone, and the Touch Bar was the final nail in that coffin.

Among the many things that Jobs did, when it came to the Mac he seemed to have only two objectives in mind:

  1. Apple will build the best machine possible – from the hardware to the software
  2. Apple will enable creatives, dreamers and makers

The MacBook itself became a sort of paragon – a gold standard for notebook design. Every generation was thinner and lighter. Apple introduced Retina resolution, the best touchpad on the market (still undefeated in my opinion), unibody design, relentless optimization of the keyboard, and occasional ground-up rewrites to ensure that OSX would remain stable and performant.

Very few of those choices were informed by market forces. When it comes to designing a product to target a market of any sort, most companies will typically do the least they can get away with. There’s a cost/benefit formula to everything: how much money gets sunk into R&D, versus how many sales are required to make a profit.

Problem with that is, markets are generally full of shit. Consumers don’t know what they want until you put it in front of them – one of the many insights that drove Jobs, and by extension, Apple.

It didn’t matter to Apple that they were sinking far more R&D time into parts of the device that most consumers wouldn’t ever touch. It didn’t matter which way the market was going at any one point. If they made a decision (even if unpopular) – consumers be damned. Apple owned the game.

And that worked out very well for the makers – the photographers, video editors, software engineers, designers and artists. They could rely on successive generations of the MacBook Pro line to thoroughly equip them to create better things, no doubt driven by an obsessive CEO who was never satisfied with the output.

When you build the best, and sell the best, inevitably you attract the best. There’s a small, but significant halo effect that Apple has created here: Their hardware has attracted the best developers. Not just the developers that code for money, but also the ones that code because it’s their lives.

And when those developers need to solve a problem, chances are they’ll use the best tools at their disposal to do so. Over time, this meant that the brightest and most capable engineering talent accumulated within the Apple ecosystem. Small wonder, then, that the Android and Windows app stores are currently seen as second-class citizens, or that MacBooks are effectively mandatory at any tech startup.

All of that talent has a material impact, too. Consider the iPhone – the hardware tends to stay ahead in some areas, the operating system is often criticized for being feature-limited, but the app store is second to none.

Which does make a big difference. I’ve owned several different phones, the best of which (hardware-wise) was a Lumia 1520. Brilliant screen, camera, battery and touch surface, but the apps were barely functional, and less than two months into the contract I bought a new phone out of sheer frustration. I know I’m not the only one.

You can do much more (and much better) with an iPhone than you can with any other device, which is why this chart should not come as a big surprise:

[Chart: respective FY-2015 revenue totals, taken from audited financials]

In FY 2015, the Apple iPhone product line alone generated more revenue than entire competing companies.

And don’t underestimate the ripple effect to major software vendors. Producers of high-end creative software packages (Photo manipulation, video editing, sound editing) aim for the Mac platform because that’s where the high-end creatives are. If that market starts drying up, so too do the updates that regular users benefit from.

No wonder everyone – users and analysts – thinks that Apple is unstoppable.

Except that it isn’t, because it’s failing to do two things right now.

First: In the short term, it just failed to equip high-end creative professionals with the best possible hardware. In the wake of the new MacBook ‘Pro’ lineup, long-time Apple users are starting to talk about defecting to other platforms. This will eventually have a degrading impact on the Apple ecosystem as a whole, especially if another vendor is standing by to give those power users what they need.

Which Microsoft neatly did with the Surface Studio this year – a desktop machine aimed squarely at designers and digital artists. If they release a better notebook for developers, I’m willing to bet quite a few luminaries would seriously consider switching to the Microsoft ecosystem.

But there’s another thing that Apple’s failed to do, and I think this is the one that really matters.

Computing is changing

For as long as we’ve kept records, we’ve needed to process them – and for a very long time, manual processing was OK. It’s hard to imagine now, but there was a time that drawing bar graphs was an actual profession.

Computers came along as a grown-up version of basic calculators. The biggest benefit was that you could change how the computer processed information by providing more information: software.

Since then, all we’ve really done is more of the same. Computers have gotten millions of times faster at processing instructions, and programming languages have been developed to put programming within reach of almost everyone.

Software has been getting more powerful and more sophisticated over time, but has always been bound by a very simple constraint: it required a human to learn how to program a machine. Before you could make a computer do anything, you needed to understand how to solve the problem yourself, then instruct the computer how to solve it.

That’s starting to change with recent advances in Machine Learning (more specifically, Neural Networks). It’s a simple but powerful layer of abstraction: Instead of telling computers what to do, we’re teaching them how to decide what to do for themselves.

Here’s a simple example: https://quickdraw.withgoogle.com/

That’s a simple neural network game. It processes the images you draw, and matches them against similar images that other people have drawn. Over time, it learns to recognize more variations of objects. There may come a time when it has seen so many variations that its accuracy becomes close to human (if not perfect).

However, that machine was not programmed by a human to recognize every possible shape. It was built, instead, to find patterns, and to correlate those patterns with ones it’s seen before. That’s the difference.

Problem-solving itself is starting to change. In future, instead of solving problems by writing machine instructions, the heaviest problems will be solved by building machines that can learn, and by training them to solve those problems for us.

Today, solving these problems requires the use of cloud computing. Sure, you can run a small neural network on your laptop, but to give it true power it has to scale out to hundreds of computing nodes in parallel.

And so, today, there are two vendors which are leading the field here: Google and Microsoft.

Google’s Cloud Platform is exposing APIs that developers can start using today to add natural interactivity to their applications, and they’re already doing cool things with neural networks – for instance, zero-shot translation.

Microsoft, through the Azure platform, is building towards making even stronger capabilities available to end-users. They already claim human-parity speech recognition, and have recently partnered with OpenAI.

Personally, I think Microsoft is leading this race. Google’s got the edge on quantity – their entire infrastructure is geared towards processing large amounts of data and finding useful relationships. Microsoft, on the other hand, has way more infrastructure, a stronger research team, and seems better-equipped to tackle the more interesting use cases.

In any case: Apple is nowhere to be found. The company that built itself on the quality of its hardware, equipping high-end creatives and reaping the benefit of their participation in the ecosystem, has precisely zero play in the AI game.

So that’s the rationale behind my position. Vendors other than Apple are building the tools that power-users of the future will require, and so will attract more power-users. They, in turn, will have the same halo effect on the Microsoft (and/or Google) ecosystems.

Ergo, I’m “on record”:

I can call it right now: The day will come when the Windows app ecosystem rivals the OSX ecosystem for quality, and after that, we’ll come to think of Apple vs Microsoft as Myspace vs Facebook.

To wrap it up in tl;dr terms:

  • The future game is about AI, neural networks and machine learning
  • The winner will be the vendor that can solve for the most complex problems in the most cost-efficient way
  • Microsoft is currently positioned to build the best hardware/software/services ecosystem to enable developers to do just that

Apple will lumber on as a consumer brand. The core value proposition (Apple hardware is for makers) has now been sacrificed on the altar of market forces. Whoever comes up with the best ecosystem for AI will win on the software front, which will be the only front that matters in the end.