VLDB Solutions Blog (http://www.vldbsolutions.com/blog) – Blogging about Big Data, Business Intelligence (BI) and Data Analytics

VLDB – We’ve Moved! Returning To Our Founder’s Roots
http://www.vldbsolutions.com/blog/vldb-office-move/ | Thu, 10 Jan 2019

The rest of the Liverpool-based VLDB team and I recently moved offices. This move represents a 30-year journey back to where I started. Allow me to explain…

‘Back in the day’ there were two main employers in Old Hall St, the centre of Liverpool’s CBD: Royal Insurance (‘The Royal’) and Littlewoods.

The Royal was founded in 1845 and had its headquarters in New Hall Place, aka ‘The Sandcastle’. It was also the only FTSE 100 company with an HQ in Liverpool. This is where yours truly started his IT career as a graduate trainee in 1988. At the time there were 400 or so IT staff and about 2,000 Royal staff in total in The Sandcastle.

Directly across Old Hall St stood the HQ of Littlewoods, named the JM Centre after John Moores, who founded Littlewoods in 1923. The Littlewoods empire covered retail and football pools. During the 1980s Littlewoods was the largest private company in Europe.

Somewhat bizarrely, the two large employers on opposite sides of Old Hall St were both very early adopters of Teradata. That’s right, Liverpool was a hotbed of ‘Big Data’ early adoption – dating right back to the mid-1980s, before most bearded hipsters were even born. Imagine that!

A lot of credit for the early adoption of Teradata in Liverpool, Manchester and further afield in the north of England was due to the efforts of my good friend Jim ‘The Phone’ Clarke (RIP) and his Teradata colleagues. Well done chaps.

After leaving The Royal in ’92 I spent the next 10 years as a freelance Teradata contractor. Following ‘a year of grunge’ at Bank of America in San Francisco, I rocked up at TSB in Manchester. After 6 years I eventually left what had become Lloyds-TSB-C&G-Scottish Widows.

I didn’t leave Lloyds lightly, but the role on offer was simply too tempting: I was hired as the technical lead to work on ‘project Zeus’ for Littlewoods. The project was based in Littlewoods HQ: the JM Centre in Old Hall St, Liverpool. No more hundred-mile-a-day round-trip commuting to ‘Sunny Wyth’, as the LTSB folks called Wythenshawe. I was finally based back in Old Hall St after leaving The Royal eight years previously.

Project Zeus represented a £multi-million investment in Teradata’s database and newly acquired CRM application, SAS for analytics, and MicroStrategy for BI. Although Littlewoods had been Teradata users for over a decade at this point, project Zeus represented a major platform upgrade to enable more advanced CRM, analytics, and BI.

During the start of the first quarter of 2001, the Littlewoods Zeus team was extended with Teradata application developers. This included VLDB’s very own Dave Agnew and Stu Pemberton, both fledgling freelancers and both ex-Great Universal Stores (GUS) in Manchester. Like The Royal and Littlewoods, GUS was also a very early Teradata adopter.

Throughout 2001-02 the team toiled away to build the Zeus platform. The views from the 9th floor of the JM Centre over Liverpool, the River Mersey and North Wales were enjoyed by all. Dave, Stu and I also took the required exams during this period to become amongst the first Teradata Certified Masters.

The Shop Direct Group (‘SD’) was formed in 2005 through the merger of Littlewoods and GUS. Dave and Stu were central to the development efforts to merge GUS onto the ex-Littlewoods Zeus platform. SD headquarters moved from the JM Centre to the old Liverpool airport at Speke.

VLDB has occupied various offices in Liverpool. During 2018 we took the decision to look for something nearer the CBD. The ‘funky’ end of town had become a bit tired for our liking.

After a bit of digging around, we got in touch with Bruntwood, who seemed to have a large presence in the CBD. Of particular interest was the fact they owned The Plaza, which is none other than the old Littlewoods HQ – the JM Centre. In fact, The Plaza is their flagship office building in Liverpool.

As a testament to Richard Jackson’s power of negotiation, we managed to secure new offices in none other than The Plaza. Not only that, we’re on the 12th floor facing the ‘Royal Blue Mersey’.

So, after starting out as an IT trainee on the other side of Old Hall St, I’m sat here with my river view writing this blog from VLDB’s shiny new offices, three floors above where VLDB’s senior team met while developing the Zeus platform that still runs at Shop Direct nearly two decades later.

It’s good to be ‘back home’.

 

Royal Insurance Graduate IT Trainees, Class of ‘88

 

Analytical Maturity Models – Machine Learning and Artificial Intelligence
http://www.vldbsolutions.com/blog/analytical-maturity-models/ | Tue, 08 Jan 2019

According to a recent Gartner study (Dec ’18), “87% of organisations have low BI and analytics maturity” (source: ‘Low BI and Analytics Maturity’).

On first impressions, this sounds like ‘shots fired’ by Gartner. So, what can they mean by ‘low BI and analytics maturity’, and can this possibly be true?

 

Analytics Maturity Levels

 

There are well-documented ‘Analytics Maturity Models’ that typically range from basic reporting capability via predictive analytics to an operational/active analytic capability.

A typical Analytics Maturity Model consists of 5 levels:

level 1 – ‘reporting’
level 2 – ‘analysing’
level 3 – ‘predicting’
level 4 – ‘operationalising’
level 5 – ‘activating’

We’d also add the following extra analytic maturity levels to the above list:

level 0 – ‘operating’
level 6 – ‘learning’

Let’s take a closer look at the extended BI/analytics maturity levels, from 0 to 6.
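
To keep the whole thing in one place, here’s the extended model captured as a minimal Python lookup table – purely a convenience sketch, using the level names from the lists above and the ‘question answered’ covered in the sections that follow:

# Extended analytics maturity model: level -> (name, question the level answers).
MATURITY_LEVELS = {
    0: ("operating", "reliant on operational reports; no dedicated BI/analytics"),
    1: ("reporting", "what happened?"),
    2: ("analysing", "why did it happen?"),
    3: ("predicting", "what will happen?"),
    4: ("operationalising", "what is happening, in (near) real-time?"),
    5: ("activating", "react to what is happening, in (near) real-time"),
    6: ("learning", "what could happen? (ML/AI)"),
}

def describe(level):
    """Return a one-line description of an analytics maturity level."""
    name, question = MATURITY_LEVELS[level]
    return "level {} - '{}': {}".format(level, name, question)

for lvl in sorted(MATURITY_LEVELS):
    print(describe(lvl))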

Level 0 – Operating

Based on VLDB’s client experiences, we’ve added a ‘level 0’ (operating) below the normal analytic maturity entry point at ‘level 1’ (reporting).

The premise here is that some folks still rely on operational reports produced by the source system(s), and have no dedicated BI/analytics capability. Yes, these organisations do still exist!

Personal data extracts from operational systems, followed by desktop processing in Excel, are endemic at this level of capability, which Gartner generously describes as ‘Basic’.

Folks that rely on operational reports produced by source systems don’t really know what happened, and certainly don’t know why. Operational reports are often wrong, so inflexible as to be of limited use, or interpreted incorrectly. Worst case, all three.

“But we’ve used these reports for years” doesn’t make operational reports correct, far from it. If you can’t prove operational reports are correct then you shouldn’t rely on them.

Level 1 – Reporting

At the ‘reporting’ level of analytic maturity, organisations are able to understand ‘what happened’.

Level 1 represents a basic KPI report capability, also known as ‘rear view reporting’.

The main challenge at level 1 is to deliver tested KPI reports in the right format, to the right consumers, at the right time. Folks that consume KPI reports at level 1 are known as ‘farmers’.

If you can’t prove that your KPI reports are correct or if they don’t always arrive on time and SLAs are missed, then you can’t claim to be at level 1. Shame on you!

Level 2 – Analysing

At the ‘analysing’ level of analytic maturity, organisations are able to understand ‘why it happened’.

Level 2 represents a KPI report capability, with the added ability to ‘drill down’ to the atomic data so that aggregate values of interest can be understood/defended/debated. Folks that slice/dice the data at level 2 are known as ‘explorers’.

The key point here is that all of the atomic data used for KPI reports must be retained and made available via the BI toolset so that drill-down and slice/dice by explorers is possible. This is a big departure from a ‘reporting only’ capability serving farmers using aggregate data.

Level 2 is the minimum analytic maturity level all organisations should attain.

Level 3 – Predicting

At the ‘predicting’ level of analytic maturity, organisations are able to understand ‘what will happen’.

Level 3 represents the world historically inhabited by analysts wielding SAS, SPSS, KXEN or similar predictive tools.

The development, refinement, and deployment of statistical models predicting things like customer churn, pricing change impact, and loan default rates are typical endeavours at this level of analytic maturity.

Your organisation is at level 3 if you have a small team of oh-so-clever analysts that have created an entire de-normalised copy of your data warehouse somewhere else that they manage themselves 😉

Level 4 – Operationalising

At the ‘operationalising’ level of analytic maturity, organisations are able to understand ‘what is happening’ in (near) real-time.

Over the last few decades, ETL latency has reduced from monthly to weekly, daily and hourly. As latency has reduced we have moved ever nearer to ‘real-time’ analytics. That said, few organisations achieve true real-time analytics.

Technical challenges aside, there is seldom a solid business case for true real-time ‘operationalised’ analytics.

Level 5 – Activating

At the ‘activating’ level of analytic maturity, organisations are able to ‘react to what is happening’ in (near) real-time.

It’s not easy to integrate analytics and operational systems, far from it. As a result, operationalising analytics is an aspiration for most organisations.

Activating analytics is quite scary for us analytics folks. We’re not used to being mission critical and dealing with customer-facing issues that folks actually care about!

Level 6 – Learning

The analytics world is currently ‘all about’ Machine Learning (ML) and Artificial Intelligence (AI). Activities such as ML and AI try to help us mere mortals understand ‘what could happen?’

According to Wikipedia:

“Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to progressively improve their performance on a specific task.”

“In computer science, artificial intelligence (AI), sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals.”

There are lots and lots of free open source ML and AI tools out there. The gotcha is that you’re going to need some smart folks to drive said tools. Even worse, they’re calling themselves ‘data scientists’ (ugh) these days, and they’re ‘reassuringly expensive’.

A key challenge at the ‘learning’ level of analytic maturity is data preparation. The choice most organisations face is whether to let data management folks wrestle with data science, or let data science folks wrestle with data preparation. Those with the skills to conquer both the data preparation and subsequent ‘data science’ are rarer than the proverbial…

By pursuing ML and AI, organisations are often leaping from level 2 (analysing) or 3 (predicting) and bypassing levels 4 (operationalising) and 5 (activating). This is quite common and is a great way for the oft-overlooked analytics folks to get back on the exec radar. Hurrah!!!

The Exploration of Programming Languages – Standards of a Coder
http://www.vldbsolutions.com/blog/standards-of-a-coder/ | Thu, 22 Nov 2018

Having been involved in IT for 30 years I have often been left bemused by the standards used by some programmers and sometimes imposed by programming languages.

I started on Cobol. There were strict naming standards for file division, working storage and linkage section variables so you knew what you were dealing with.

Indentations were mandatory, mainly for the purposes of readability but also to show structure.

Mickey Mouse constructs using GOTO were banned – mainly because flat coding is decidedly unreadable and in some instances undecipherable.

Python Code Example | Author: Terezo (2011)

Coding was in upper case only or lower case only. No-one cared as long as it was one case but the ruling was mainly site based. Mainframe sites tended to be in upper case because it was easier to read on a green screen. Lowercase seems to have become more popular since PCs were introduced with colour screens. It also tends to be easier to read, possibly because most of what we read is in lower case by convention. Kids today must think we typed in upper case so the computer realised it was being given a bunch of commands.

I remember someone trying to impose his will by introducing a mixed case standard. The reason was that no-one else was doing it and we needed to be different. He was almost burned at the stake until someone pointed out that such actions may have repercussions, even if we were in Lancashire.

There is a practical reason for having a single case as well. Not only is it easily readable, especially when used with underscores to separate words, but it reduces the number of times you use the shift key and is, therefore, quicker, especially for beginners to the torture known as typing.

Process documentation was non-existent – the only place to look was in the code itself. Almost all languages have a provision for comments. This is a good thing as files, either on paper or disc, tend to get lost or can become outdated if there are different versions of the code. Comments in the code, as long as they reflect reality, are the best way to document a process. The excuse of ‘I didn’t have time to find the documentation and update it’ can’t exist.
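
To make the last two points concrete – single-case names with underscores, and comments that document the process in the code itself – here is a small, entirely made-up Python example:

# Calculate the total order value for a customer.
# Business rule: cancelled orders are excluded from the total.
def total_order_value(orders):
    total = 0.0
    for order in orders:
        if order["status"] == "cancelled":
            continue  # cancelled orders do not count towards the total
        total += order["net_value"]
    return total

customer_orders = [
    {"status": "dispatched", "net_value": 19.99},
    {"status": "cancelled", "net_value": 5.00},
]
print(total_order_value(customer_orders))  # prints 19.99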

Linux | Open-Source Software Operating System

I moved on to Linux and SQL. Standards are used with these languages as well, for the same reasons.

By now, you should be getting the idea that if all code in a workplace looks like it was coded by the same person then everyone will be familiar with the style, methods, layout etc. This makes life easy for everyone.

There are those who maintain that, as coding is a creative expression, such standards shouldn’t be applied. Nazi state, the SS and all that carry on.

However, building a car is a creative expression and you don’t see Ford letting everyone decide where to put holes, wheels etc. Production environments thrive on having a standard approach.

I was introduced to VB back in the ’90s and quickly learned to avoid it. In the ’00s I was re-introduced via .NET and hated it even more.

The reasons? Variable names beginning with an underscore and camel case naming standards. Whoever thought any of that was a good idea should be tried for crimes against humanity (in my opinion).

Moving on to more recent times, I have been dragged kicking and screaming into the 21st century. I have had to start learning Python.

 

Python Software Foundation

At first, there was the usual confusion and associated frustration. I tried reading books but they didn’t explain stuff the way I like it explained. I hated it.

Then I found out why Python is like it is. It is designed to be used in a production environment. It has enforced standards to make it readable and uses indentation to delimit blocks – e.g. no ‘if…fi’ type constructs where the ‘fi’ is needed to mark the end of a block.
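
A minimal illustration of the point – the indented block is the structure, with the shell-style equivalent shown in a comment for contrast:

# In shell script an explicit terminator is needed:
#   if [ "$row_count" -eq 0 ]; then echo "empty table"; fi
# In Python the indentation itself marks the block - there is no closing keyword.
row_count = 0
if row_count == 0:
    print("empty table")
else:
    print("{} rows loaded".format(row_count))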

Python 3 is even easier to use. There is, for example, only one integer type, with the arbitrary-precision arithmetic handled in the background. The same goes for assigning data types to variables – Python works the type out for you.
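
A quick demonstration of both points:

# Integers have arbitrary precision - no separate int/long/bigint types to manage.
big_number = 6 * 10**23
print(big_number * big_number)  # happily exceeds the 64-bit range

# Types are assigned for you based on the value.
row_count = 42          # int
table_name = "orders"   # str
load_ratio = 0.85       # float
print(type(row_count), type(table_name), type(load_ratio))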

It is also (like .NET) based on OO methodology (a library of subroutines to us older duffers) so is understood by just about everyone in terms of functionality.

Most importantly, it doesn’t make me use the despised camel case. Even as a committed atheist, I am beginning to believe there may be a God!

Sources:

Linux – https://www.linux.org/

Python Code – https://commons.wikimedia.org/wiki/File:Python.png

Python – https://www.python.org/community/logos/

 

30 Years In IT…You’d get Less for Murder!
http://www.vldbsolutions.com/blog/data-industry-advice/ | Wed, 07 Nov 2018

A few weeks ago marked 30 years since yours truly started out as a graduate IT trainee at Royal Insurance (now RSA) in 1988.

Back then, there were no PCs, just dumb terminals; no mobile phones, just landlines; no email, no internet. But the biggest shock of all: other people could smoke at your desk!

Anyway, I thought I’d put together some thoughts looking back over the ‘first 30 years’. So, in no particular order, here goes…


 

Training, Skills Shortage & Staff Retention 

The main reason I accepted the job offer from Royal Insurance was the training programme. I had an unblemished track record of being truly crap at both BBC Basic and Pascal whilst at Leeds Uni. Anyone willing to spend 3 years training me from scratch as an analyst/programmer and pay me at the same time is more than welcome to give it a shot, I reasoned.

The training I received at ‘The Royal’ was, and is, second to none. It was accredited and audited by the British Computer Society (BCS) and was like studying for another degree. I felt like giving up a few times – writing Cobol to run on an IBM mainframe will do that – but I managed to get through it intact. The dropout rate was remarkably low.

The old adage ‘Hire for attitude, train for skills’ is something I strongly believe in to this day. It guides all of our candidate selection at VLDB. For those that won’t make the training investment ‘in case they leave’, I offer Richard Branson’s guidance. Fine words indeed.

Stop moaning about skill shortages, find & train people and treat them well. Simples! 

 

Application Support

My first role at Royal was in the accounts support team. Right from the off, the importance of best practices and ‘good code’ as a long-term cost reduction strategy really resonated. Little did I know that I was being schooled in ‘technical debt’ avoidance right at the start of my career.

A lengthy stint at Lloyds TSB in the 90’s included several years in the application support team. The Teradata best practice guidelines developed by the application support team over 20 years ago are still in use today, albeit in modified form. Support isn’t just about fixing broken code!

Put good people into application support and pro-actively coach developers and enforce standards to avoid technical debt.

 

Teradata

During the latter stages of my time at Royal, I moved to the ‘reporting’ team that developed MI applications on a new-fangled Teradata system. Little did I know we were amongst the earliest adopters in the world, right here in Liverpool.

Having run my first Teradata SQL query 28 years ago, I’m still staggered just how much Teradata got right from the start. It’s a truly remarkable achievement when you realise the fundamentals haven’t had to change in all that time.

In addition to Teradata getting so much right so long ago, there have also been no major technology missteps along the way. I can’t remember anything being consigned to the ‘Oops, what were we thinking?’ pile.

A big ‘well done’ to the original Teradata folks. Very fine work indeed.

  

Data Models

Several years ago I was dispatched to a client to help referee a disagreement between two camps: the architecture team and the data modeling team. 

The data modelers worked in a vacuum and did ‘data modeling by the book’ in order to implement an industry-specific vanilla data model they’d bought. Academic purity was all that mattered.

This didn’t sit well with the architecture team, who pointed out that time-to-market and user satisfaction were being sacrificed at the altar of puritan data modeling beliefs.

After interviewing dozens of stakeholders, and ruminating on the standoff, I decided the data modelers were causing real pain and suffering. Furthermore, their behavior was largely an unavoidable consequence of having bought a vanilla model. I’ve since seen this play out at several client sites, some of whom really should have known better. 

A vanilla data model is not a silver bullet. Be prepared to build a separate, physical, semantic layer…which will probably look a lot like your original home-grown data model.

 

MPP, Hadoop & Critical Analysis

Teradata’s eponymous MPP database started shipping in 1984.

Google’s MapReduce white paper was published 20 years later in 2004. Big G was subsequently awarded a MapReduce patent which was criticised by DeWitt and Stonebraker for lack of novelty, citing Teradata as prior art. I side with Dave and Mike on this one.

Anyway, inspired by Google’s MapReduce and Google File System (GFS) research, Doug Cutting and the good folks at Yahoo! gave rise to the open source Apache Hadoop framework in 2006. Companies such as MapR, Hortonworks, and Cloudera were formed specifically to monetise Hadoop.

Over the last decade we’ve witnessed the all-too-familiar tech industry chain of events in the Hadoop world: the latest tech silver bullet gets VC backing; sales & marketing money flows; sales reps jump ship to the newest game in town; analysts publish gushing praise (not paid for, no sir!); conferences are held; fanbois gulp down the Kool-Aid;  management jump on the FOMO bandwagon and POCs are hastily scheduled.

The Hadoop projects we’ve been involved in have almost all involved moving off Hadoop onto something easier to set up, use and manage. SQL queries that run economically at any scale are the requirement. Hadoop isn’t the answer, no matter how much effort the Hadoop slingers put into SQL-on-Hadoop. We’ve had a scalable, SQL compliant, MPP database for over 30 years, remember?

I stand by the assertion made early last year: for most folks most of the time an MPP database will deliver the business requirement. Hadoop simply isn’t needed.

VCs, analysts, sales reps, conference organisers and fanbois seem to be able to nullify any attempt at critical thinking in the tech-using community.

 

Keep It Simple Stupid (KISS)

The old ‘KISS’ adage is the guiding principle I adhere to most. For the non-believers, I offer the following KISS disciples for you to argue with:

‘Simplicity is the ultimate sophistication’ – Leonardo Da Vinci

‘Everything should be made as simple as possible, but no simpler’ – Albert Einstein

 

On that note, thanks for your time…I’m off to wrangle some thoroughly modern Python machine learning code. Cobol? No thanks!

Sources:

  1. Royal Insurance – https://www.rsagroup.com/
  2. Teradata – https://www.teradata.co.uk/

 

Gaming, Data, and Behaviour
http://www.vldbsolutions.com/blog/gaming-data-and-behaviour/ | Thu, 18 Oct 2018


From time spent in certain game modes, to player damage with weapons, and how players move around maps, video game developers have had access to a wide range of telemetry data for years now. They’ve been using this data to refine the core feedback loops they use to keep players engaged and coming back for more, driving sales and improving player experiences. It’s clearly working too, with the US video game industry alone generating $108.4 billion in 2017.

 

Source: Newzoo | April 2018 Quarterly Update | Global Games Market Report

A large share of this revenue is generated through micro-transactions – power-ups or cosmetic items bought in-game, either outright with your hard-earned cash or using in-game currency generated through play. This is a concept many people are increasingly familiar with: plenty of us spend the morning commute playing Candy Crush on our phones, where the option to buy extra lives or special block-busting abilities is paraded in front of you.

The genius of these games is not that they include a store, it is that the store is paired with well-designed difficulty loops. The game is designed to give the player a few easy levels where they begin to feel comfortable, followed by one that they can’t quite seem to get past, no matter how hard they try. With every failure, frustration builds, and the developer offers you an easy out for the price of a coffee. When viewed from this angle it’s easy to see why players turn to the in-game store in droves. After all, think of the rush you get from finally beating the level and moving on from Sugar Plum Village. Except of course, a few levels down the road the process repeats, and you find yourself drawn towards that store again.

This is fantastic for investors and executives, ever looking to make the company more profitable and increase returns, but the increased pressure for profit comes with a very human price. ‘Whales’ – as the industry calls them – are consumers willing to spend thousands of pounds on micro-transactions in their favourite games, sometimes at the expense of other areas of their lives. Unfortunately, they are not as rare as you might think. After all there are thousands of gambling addicts, and if you reread the previous paragraph pretending instead that it refers to fruit machines found in pubs, can you honestly say that they are very different?

This is something that has recently come to the fore in various European countries, most notably after the Belgian Gaming Commission began investigating the legality of so called ‘Loot Boxes’ in the game Star Wars: Battlefront 2. This was followed in February of this year by Hawaii, introducing legislation to limit the sale of games that include Loot Boxes to those over the age of 21, and the Netherlands, declaring some implementations of Loot Boxes to be illegal under the Betting and Gaming Act in April.

Artist: Sandro Pereira – The products or characters depicted in these icons are © by Nintendo

Loot Boxes are a method of micro-transaction being pushed quite hard by some titans in the industry such as Electronic Arts (EA), who publish a wide range of incredibly popular games, ranging from the well-known FIFA franchise to The Sims. The idea is quite simple: you pay an upfront price for a randomised pack of in-game items, be they new football players for your team as is the case in FIFA, or new weapons and characters in Star Wars: Battlefront 2. Certain items gained are better than others, but you won’t know if you have something good or not until you have bought and opened your loot box. The better the item is, the less likely it is to appear in your box, and so the more you have to buy in the hopes of finding that one item that you’re sure you need to crush your opponents.

The price of loot boxes, the frequency of items within them, and the effect they have on gameplay are all carefully calculated using data collected over years to ensure player engagement, retention, and spending. If the cost is too high, if the rarest items are spread too far apart, or if even the best items unbalance the game too much, then you run the risk of players turning their backs and finding new holes to throw their money into. Find that sweet spot though, and it’s possible to rake in money hand over fist. All of this has more and more government bodies sitting up and taking notice, and some in the industry fear that the aggressive push for more profit runs a risk of bringing this gravy train to a screeching halt.

Using flashing lights and the promise of a potential reward is a well-documented way to abuse the weakest parts of our nature. When this is combined with the sheer quantity of data available to fine tune the experience, we shouldn’t be surprised to find that what is a perfectly reasonable hobby to many is in fact a nightmare for some. Indoctrinating us into a gargantuan gambling machine is not the only way this method has been used though. Gamification, as it’s also known, has been put to great effect in the pursuit of knowledge and the betterment of people.

Duolingo, a language learning app, utilises the same principles of data collection and feedback loops but leverages them to help users learn a new language. By breaking the language down into small sections, giving players badges as rewards for completing those sections, and counting the days in a row the user has used the app, they encourage people who might otherwise lose the motivation to learn a language to keep returning and improve themselves. Another fantastic example is the use of crowdsourcing to help scientists with their research, sometimes called Citizen Science. One of the earliest adopters of this method was a game called Foldit, released in 2008 by a team from The University of Washington. The game asked players to manipulate pieces of proteins in 3-D to create the most stable shape possible, with players being ranked by the scores calculated from their designs. These designs were used by the scientists to help solve real problems in the lab. Among other achievements, players have helped scientists solve the structure of the M-PMV retroviral protease, an enzyme relevant to HIV research. They managed this within just 3 weeks, while scientists using traditional methods had been scratching their heads for more than a decade.

Given that without a major shift in our society, large organisations will continue to collect and utilise our data on ever growing scales, perhaps we shouldn’t be so caught up in asking who is using our data. Perhaps a more useful question would be what are they doing with it, asking if they are using it in the best way they could, and being more rigorous in holding them to account when they fail to do so.

Mini-disks, Sabre-toothed Chickens and Pizza Grease – a Deadly Serious Discussion of Technology and Prediction
http://www.vldbsolutions.com/blog/a-deadly-serious-discussion-of-technology-and-prediction/ | Wed, 10 Oct 2018

 

“All fixed, fast frozen relations, with their train of ancient and venerable prejudices and opinions, are swept away, all new formed ones become antiquated before they can ossify. All that is solid melts into air…” – Karl Marx (source 1)

When Marx wrote this, he was talking about the sweeping social, economic and demographic changes that followed the industrial revolution, but he may as well have been talking about the folly of buying a mini-disk player in 1998. Change is not only inevitable, it is also unpredictable and frequently inconvenient, and nowhere is this more frequently and rapidly proven than in the technology and information sectors.

In an effort to keep track of all the changes in his life, some bright caveperson started writing them down, the end result of which is the estimated 2,500,000,000,000,000,000-plus bytes of data produced every day by his descendants. You’d think that with all of this data to hand, predicting the next big change would be easier, but then why does everyone on TV look so surprised all the time? Flabbergastedness is endemic in our broadcasters: election upsets, economic crashes, the winner of Ru Paul’s Drag Race, everything seems less predictable than it ever was. Are we, as a society, gathering the wrong data? Are we analysing it wrong? Or, like an experiment in quantum uncertainty, does the very act of gathering data on these things affect them, and make them less predictable? For example: if we know our preferred candidate/stock/drag queen is the favourite to come out on top, will it change our behaviour in a way that actually makes it less likely?

So, do we go back to counting our sabre-toothed chickens on our fingers? I’m hesitant to blame the data itself, and not just because I prefer being sat at a desk to in a cave, but because the data available doesn’t indicate that the data is to blame. You might say ‘well the data would say that, while it’s looking guilty, wouldn’t it?’, and here we reach the crux of the problem. Data isn’t people, people are people.

Data – ‘Sorry, dude.’ (source 2)

Now I’ve nothing against people, I was one myself for a while, but the hardware is outdated and the drivers are mostly work-arounds these days and the ‘Culture’ and ‘Society’ OS updates introduced so many black-box algorithms that it’s near impossible to understand the processes they perform to create the output that they do. And don’t even get me started on the incomplete documentation.

Given all the above, it’s hard not to come to the conclusion that the next great leap forward in the predictive use of data will be in replacing people with something that can, without unintended bias, read, process and, crucially, extract meaningful information from a larger proportion of those 2.5 quintillion bytes. I am of course describing artificial intelligence.

A.I. is a hot topic at the moment. Too hot to handle? Not for some, eager to burn the roofs of their mouths with the still-bubbling pizza grease of progress; the computer scientists working for many governments and in Silicon Valley strive ever towards the creation of a true thinking machine. However, as highlighted recently by such luminaries as Stephen Hawking and Tim Berners-Lee, and continually by just about any long-running sci-fi franchise you can name, there is great irony in the fact that in order to create the most effective predictive technology, we must achieve something the results of which are wildly, and dangerously, unpredictable.

Sources:

1. Marx, K. and Engels, F., 1848. The Communist Manifesto. London: Communist League.

2. Roddenberry, G, 1966. Data [Online]. [Date Accessed: 10.10.18] Available from: http://www.startrek.com/database_article/data

VLDB DBSee on GitHub
http://www.vldbsolutions.com/blog/vldb-dbsee-on-github/ | Thu, 17 May 2018

It’s far from easy to police a busy Teradata system – even those that think they’re on top of best practice can let standards slip over the years.

 

Teradata Best Practice

VLDB have been developing Teradata best practice beliefs since 1989. To support this, we have worked to develop a program that consists of a set of SQL scripts that run against the Teradata data dictionary; each script can be used to inform a user of the current state of their Teradata system.

DBSee is now available on GitHub for all to view and use at will – and it doesn’t cost a penny.

DBSee Benefits

The key DBSee deliverable is a set of results displayed within your native database client tool. This can also be output as a .csv file and imported into Tableau for easy-to-understand visualisations.
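
As an illustration of the general pattern – note this is not one of the actual DBSee scripts; the dictionary query below is a made-up example, and it assumes the open source teradatasql Python driver, read access to the DBC views, and a hypothetical host and user:

import csv
import teradatasql  # Teradata's open source Python driver (pip install teradatasql)

# Illustrative dictionary query: list tables with their create and last-alter timestamps.
QUERY = """
SELECT DatabaseName, TableName, CreateTimeStamp, LastAlterTimeStamp
FROM DBC.TablesV
WHERE TableKind = 'T'
ORDER BY DatabaseName, TableName
"""

con = teradatasql.connect(host="tdprod", user="dbsee_user", password="********")
try:
    cur = con.cursor()
    cur.execute(QUERY)
    headers = [d[0] for d in cur.description]
    rows = cur.fetchall()
finally:
    con.close()

# Write the result set to CSV, ready to be imported into Tableau (or any BI tool).
with open("dbsee_tables.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(rows)

print("wrote {} rows to dbsee_tables.csv".format(len(rows)))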

We have yet to find a Teradata system that doesn’t fail nearly all of the DBSee best practice checks.

The great news is that all best practice violations can be fixed relatively easily. The first step is to uncover them with DBSee.

Head to our GitHub page to retrieve all of our SQL scripts and for more information on how to run DBSee: https://github.com/VLDB-Solutions/DBSEE

PostgresConf 2018 Musings
http://www.vldbsolutions.com/blog/postgresconf-2018/ | Tue, 01 May 2018

Yours truly attended the Postgres Conf in Jersey City last week with my esteemed colleague Richard Jackson. It was quite a blast, to say the least…apart from not getting home until late Saturday afternoon. Ho hum.

PostgresConf 2018, Jersey City, USA


PostgreSQL Re-Cap

PostgreSQL, or Postgres for short, is an open source database first developed as a follow-on from Ingres. The Postgres prototype was shown at the ACM SIGMOD conference in 1988. Yes, that’s 30 years ago.

During the early 2000s, Postgres was frequently used as the basis for developing ‘scale-out’ clustered analytic platforms. Without Postgres there would be no Netezza, ParAccel, Redshift, Aster Data or Greenplum. So, even the new-fangled Postgres-based MPP platforms out there have a 15-year heritage.

Parallel Postgres

The senior team at VLDB (you know who you are!) were early adopters of all things ‘parallel Postgres’. We delivered our first Netezza project in 2003 at Caudwell Communications, the fixed-line telecoms division of Phones-4-U.

As early as 2003-04 we were also engaged with the early incarnations of Greenplum. We were there during the Metapa & Bizgres days. Scott hasn’t aged a single day. Go figure!!!

Greenplum was MPP, which we liked. It was a ‘parallel Postgres MPP’, which we really liked. Most of all, unlike the other MPP offerings we’d used, Greenplum was – and still is – a ‘software only’ play. Now that’s *really* interesting to a consulting outfit like VLDB that likes to ‘roll your own’.

Greenplum Re-cap

After being founded in 2003, Greenplum the company was acquired by EMC in 2010.

Pivotal Software was formed in 2013 as a spin-out from EMC and VMWare. Greenplum was duly added to Pivotal’s software portfolio, hence ‘Pivotal Greenplum’.

Pivotal decided to open source Greenplum in 2015. Greenplum thus became the ‘World’s First Open Source MPP Data Warehouse’, which we like, a lot. Did we mention that?

Back to PostgresConf 2018

Day 2 of the PostgresConf 2018 was the ‘Greenplum summit‘.

The day kicked off with ‘Greenplum: Building a Postgres Fabric for Large Scale Analytical Computation‘ presented by our good friend Jacque Istok, and our new friend Elisabeth Hendrickson.

Immediately following Jacque and Elisabeth was a presentation by Howard Goldberg from Morgan Stanley entitled ‘Greenplum: A Pivotal Moment on Wall Street‘. Jaw-dropping tales of 20PB (10x compressed) Greenplum databases aside, Howard is perhaps the funniest and most engaging speaker we’ve ever had the pleasure to encounter at a conference, and we’ve been to *lots* of conferences.

‘Fessing up to having Pivotal’s Greenplum product manager Ivan Novick on speed-dial ahead of Mrs Goldberg had the audience in stitches. Fine work Howard!!!

How do you follow Howard? Well folks, Shaun Litt from Conversant Media did a mighty fine job with a session entitled ‘Greenplum for Internet Scale Analytics and Mining‘. We’re not easily impressed when it comes to all things ‘big’ data, but when Shaun casually dropped ‘quintillions of rows’ into the session we all sat up and took notice…and then Googled just how big a quintillion actually is.

After a spot of lunch we were treated to ‘Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum‘ by Frank McQuillan and Bharath Sitaraman from Pivotal. A core component of the Greenplum offering is scalable, in-database machine learning, graph, analytics & statistics enabled by Apache MADlib. Pivotal has also put a lot of engineering effort into text search via the Apache Solr based GPText.

The final Greenplum summit session we attended was ‘Pivotal Greenplum in Action on AWS, Azure, and GCP‘ by Pivotal’s Jon Roberts. This was of particular interest given that team VLDB have been deploying Greenplum on public cloud platforms such as Profitbricks, AWS, Azure and Google for years. Welcome to the public cloud party chaps!

As an aside, Jon maintains a most excellent blog at http://www.PivotalGuru.com. Highly recommended.

Pivotal IPO

After a very entertaining evening meal and a ‘few drinks’ with the folks from Pivotal and Blue Talon, on the Friday we headed over to the Pivotal office in NYC. This was no ordinary Friday for our friends at Pivotal (‘Pivots’), no sir. Today was the day Pivotal went public via an IPO on the NASDAQ.

A big ‘well done’ to Greenplum founders Bill Cook and Scott Yara for their roles in bringing Pivotal to IPO. Oh, and ‘Happy Birthday’ to Bill for this week.

The wine, champagne and strange blue drinks were handed out a-plenty at the Pivotal NYC office. A unique way to end a tech conference, for sure.

If you look carefully, you’ll probably spot a couple of VLDB interlopers in the NYC team photo. Sorry for blocking the folks stuck behind us, it was quite a crush!

Conference Takeaways

Although this was a Postgres conference, the main focus for VLDB was the Greenplum summit. We’ve been fans and users of Greenplum database (GPDB) for a *long* time.

The IRS, Morgan Stanley and Conversant can’t all be wrong, surely?

Pivotal has gone ‘all in’ on Greenplum as the open source MPP platform to support analytics at any scale and complexity. It is perhaps informative to note that there is no mention of Hadoop on the Pivotal product page. No way did we call this out early last year…

And Finally…

In addition to the usual pillaging of the Macy’s mens department, we managed to bring some Pivotal ‘swag’ home to the UK. Our two labradors, Molly and Ruby, enjoyed the foam rockets perhaps a little too much. A big thanks from both of them!

Google, SQL & noSQL Ramblings
http://www.vldbsolutions.com/blog/google-sql-nosql-ramblings/ | Tue, 03 Oct 2017

A couple of articles on Google and SQL versus noSQL caught my eye recently and inspired me to go on a bit of a ramble, so here goes…

Google’s Head Fake

Firstly, there’s the splendidly titled “Did Google Send the Big Data Industry on a 10 Year Head Fake?” Apparently a ‘head fake’ is to trick your opponent as to your intentions. Probably akin to ‘dropping the shoulder’ in football.

So, the question is, did the Big G send us all the wrong way for a decade? Short answer: no. Longer answer: if you did march the wrong way for several years, have a word with yourself.

The article quite rightly points out that neither distributed file systems nor distributed computing can be attributed to Google. As Curt Monash succinctly stated on DBMS2 in 2010, ‘popularization != invention‘.

However, the thinking goes that Google’s famous research papers gave rise to Doug Cutting’s efforts at Yahoo!, which led to Hadoop, whilst all the while the Big G was working on plain old SQL-based Spanner.

The Big G may have been working on other SQL-based stuff whilst the Hadoop crowd were beavering away, but does this amount to a head fake or a shoulder drop? No, not in the slightest.

Many considered the old-fangled SQL-based data warehouse to be under immediate threat due to the rise of the ‘new’ MapReduce-based SQL-free computing paradigm. Shiny & new always wins, right?

Sadly, that can be the case. I’ve witnessed ‘big data’ POCs at first hand where the main motivation was to be seen to be ‘doing big data’. I kid you not. Forget the impact on the users and existing toolsets, stop worrying about ROI, this stuff is free and cool, the vendor is saying all the right things, so it’s definitely the way to go. Woo-hoo!!!

Let’s never forget, there are armies of folks looking to monetise anything that moves in IT. The VCs, start-ups, analysts and conference organisers make their living out of getting mere mortals to believe they have the silver bullet that’s been missing until now.

What worked for Google, Yahoo, Facebook and LinkedIn doesn’t necessarily translate to the mainstream – this doesn’t make it bad tech. Hadoop is a great example. Out in the real world, SQL is the answer for most analytic applications for most folks most of the time, plain and simple.

If you believed the hype and ‘Hadoop’ didn’t turn out to be your silver bullet, don’t blame the Big G for sharing their thoughts or others for building/promoting ‘shiny & new’. No-one told you to put your finger in the fire, did they?

Why SQL Is Beating noSQL

The article entitled ‘Why SQL is beating NoSQL, and what this means for the future of data‘ really got my attention. The notion that SQL was ‘left for dead‘ and is ‘making a comeback‘ is somewhat at odds with reality, methinks.

There may have been a belief in certain quarters that the ‘answer’ was no longer SQL, but this has never been reflected out in the real world where regular folks run queries, and lots of them.

SQL is very deeply embedded in every single organisation I’ve encountered since the late 1980’s. I don’t see this changing any time soon. There is simply too much sunk cost and too little benefit in trying to ditch SQL.

Software developers may have ‘cast aside SQL as a relic’ but that certainly isn’t what happened throughout corporate IT organisations or analyst communities.

It’s clearly untrue that SQL ‘couldn’t scale with these growing data volumes‘. We’ve had scale-out Teradata MPP systems chomping on SQL since the 1980’s, and newer MPP & SQL players like Netezza & Greenplum for over 15 years. A low industry profile doesn’t mean something doesn’t exist.

As the article states, the developers of SQL recognised that ‘much of the success of the computer industry depends on developing a class of users other than trained computer specialists.‘ which is partly why it has become the de facto standard for interacting with databases.

Apparently ‘…SQL was actually fine until another engineer showed up and invented the World Wide Web, in 1989.‘ I’ll help out here. His name is Sir Tim Berners-Lee OM KBE FRS FREng FRSA FBCS. He’s quite well known.

There is no doubt that mainstream general purpose databases (SQL Server, Oracle, mySQL etc) struggled to cope with data volumes thrown out by all things digital in the new-fangled age of ‘the web’ and ‘the net’.

Some of us were already scaling out (admittedly expensive) MPP systems running SQL databases as early as the 1980’s. Big name retailers, banks and telcos have been running SQL on scale-out database systems for decades.

My first multi-billion row Teradata table is over 20 years old and still alive and well at a retail bank. A 100 billion row telco CDR table is over a decade old and runs on a parallel PostgreSQL system. How much scale do you want?

The rationale behind attempts to ditch legacy SQL databases is neatly summarised: ‘NoSQL was new and shiny; it promised scale and power; it seemed like the fast path to engineering success. But then the problems started appearing.’

The likes of DeWitt & Stonebraker know a thing or two about this stuff and were early nay-sayers. Feel free to disagree, obviously, but dismiss their observations at your peril.

Most of the post-Teradata attempts to develop scale-out SQL-compliant databases have leveraged PostgreSQL. This approach dates back to Netezza over 15 years ago, and includes the mega-popular AWS Redshift.

The light-bulb moment and conversion to SQL is something Teradata went through over 30 years ago: ‘One day we realized that building our own query language made no sense. That the key was to embrace SQL. And that was one of the best design decisions we have made. Immediately a whole new world opened up.’

Robb Klopp covers Chuck McDevitt’s ‘SQL is the answer’ light-bulb moment whilst at Teradata in 1985 in Chuck’s obituary. RIP Chuck, without doubt owner of the biggest brain I ever met.

The King Is Dead. Long Live The King.

SQL didn’t die. It didn’t recede. SQL is far from perfect, but it isn’t likely to go away any time soon.

Kool-Aid drinkers took themselves on a detour into the land of file processing via developer-only languages all of their own accord, whilst all the time mainstream IT organisations and user communities carried on quite happily with boring old legacy SQL. The same SQL that runs the world.

The first time I ran an SQL query on Teradata in the late 1980’s I realised I was immediately free from the pain and suffering of developing Cobol for analytics. To this day I can clearly remember the relief (sorry Grace).

It seems a certain section of the IT community is going through a similar realisation: if you want to play with data, SQL was, is, and should be the default starting place.

The King is Dead. Long live the King.

Azure IO Testing
http://www.vldbsolutions.com/blog/azure-io-testing/ | Wed, 01 Mar 2017

Azure IO Testing

One of the first things we do here at VLDB when we have a yearning to set up a database on a cloud platform is to develop an understanding of disk input-output (IO) performance.

For databases in particular, we are specifically interested in random read performance. A high random read throughput figure is necessary for an efficient and performant MPP database.

Microsoft’s Azure is the test platform in question. The approach taken is to commission a single CentOS7 VM, attach various disks and measure IOPS & throughput.

Azure VM Setup

A single CentOS 7 VM is configured. Disks are added in the required size, type & number in order to maximise IO throughput at the VM level:

Azure OS Console

The starting point on Azure is SSD disks, as the IO figures for SSD are generally superior to HDD.

Disk size determines both throughput and available IOPS, so this will also need to be considered in the overall build:

Azure Disk Console

 

Azure VM & Disk Choices

For the 1TB SSD selected above, IOPS are limited to 5,000/sec and throughput is limited to 200 MB/s.

As an aside, cloud providers usually fail to distinguish between read & write, sequential & random or specify IO size when they bandy about throughput figures. The devil is in the detail, dear reader(s).

In order to use premium storage it is necessary to specify the account as ‘Premium LRS’.

The range available that is hosted in the UK is limited, so if you have a requirement for a UK-based system, make sure that what is on offer from Azure will satisfy the requirement:

Azure UK Console

Note that only certain VM types support SSD disks.

For the basic D14v2 with 16 CPU cores, 112GB RAM and 800GB disk, it is not possible to specify SSD disks. However, the Ds14V2 VM type will allow these ‘Premium’ disks.

Also note that the 800GB is for the operating system and swap space only – the storage is non-persistent on this device specifically.

The D14 also allows up to 8 NICs and has an ‘extremely high’ network bandwidth. This is useful if the requirement is to build an MPP cluster to facilitate high-speed node:node data transfer.

The Ds14V2 VM is limited to 32 disks:

Azure ds14v2

The Standard_DS14_v2 VM is limited to 64K IOPS or 512MB/s in cached mode (host cache = read or read/write) or 51K IOPS (768MB/s) in uncached mode (host cache = none).

The P30 1TB SSD disks used are rated at a maximum of 5K IOPS (200MB/s) per disk:

Azure P30
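
As a rough sanity check of our own (simple arithmetic, not an Azure-documented figure): at the 32KB blocksize used for the tests below, the per-disk IOPS cap is reached before the per-disk throughput cap.

# Per-disk P30 limits at a 32KB IO size (back-of-envelope arithmetic).
iops_limit = 5000             # P30 IOPS cap per disk
block_size_bytes = 32 * 1024  # 32KB test blocksize
throughput_mb_s = iops_limit * block_size_bytes / 1e6
print("{:.0f} MB/s".format(throughput_mb_s))  # ~164 MB/s, below the 200 MB/s cap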

Azure IOPS Testing

Booting CentOS7 and upgrading the kernel takes a matter of minutes.

FIO is then downloaded using git and installed to perform the IOPS tests:

git clone --branch fio-2.1.8 http://git.kernel.dk/fio.git /fio

To run a FIO throughput test, simply enter the following from the fio directory:

./fio --rw=randread --name=<directory>/fio --size=<size>g --bs=32k --direct=1 --iodepth=16 --ioengine=libaio --time_based --runtime=<elapse> --group_reporting

  • <directory> – the directory (disk) where the tests are executed.
  • <size> – size of the test file created in GB.
  • <elapse> – elapse time for the test.
  • bs=32k – blocksize (default 4k). 32Kb is a good size to test for database suitability.
  • direct=1 – non-buffered IO.
  • iodepth=16 – concurrent IO threads.
  • ioengine=libaio – native Linux asynchronous IO.
  • time_based – test elapse time.
  • group_reporting – report on group total not individual jobs.

For example:

./fio --rw=randread --name=/tests/disk1/fio --size=1g --bs=32k --direct=1 --iodepth=16 --ioengine=libaio --time_based --runtime=30 --group_reporting
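
When several attached disks need testing it’s handy to script the runs. A minimal sketch of our own (not part of fio itself) – it assumes fio with JSON output support (2.1 onwards) and hypothetical test directories mounted as /tests/disk1, /tests/disk2 and so on:

import json
import subprocess

# Mount points for the disks under test (hypothetical paths - adjust to suit).
test_dirs = ["/tests/disk1", "/tests/disk2", "/tests/disk3"]

for directory in test_dirs:
    cmd = [
        "./fio",
        "--rw=randread", "--name={}/fio".format(directory), "--size=1g",
        "--bs=32k", "--direct=1", "--iodepth=16", "--ioengine=libaio",
        "--time_based", "--runtime=30", "--group_reporting",
        "--output-format=json",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    report = json.loads(result.stdout)
    read = report["jobs"][0]["read"]  # group_reporting gives a single job entry
    # fio reports read bandwidth in KB/s and IOPS per job.
    print("{}: {:.0f} IOPS, {:.0f} MB/s".format(directory, read["iops"], read["bw"] / 1024))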

Azure Disk Read Results

Multiple random & sequential read tests were executed against a Standard_DS14_v2 VM with between 1 and 6 P30 1TB SSD disks attached. Blocksize is 32KB in all cases.

As stated earlier, we are primarily, but not exclusively, interested in read and not write performance when assessing a cloud platform. This is due to the fact that 80-90% of all data warehouse IO activity is read, and only 10-20% is write. If the read performance isn’t good enough we don’t really care about write performance as the platform is probably unsuitable.

Sequential and random read results using fio on Azure are as follows:

Azure IOPS Results

Azure Read Performance Observations

Random and sequential read give similar performance. This is both unusual and most welcome for those of us that rely on random read performance for data warehouse systems!

At the VM level, total disk throughput does not increase as more disks are added to the VM. Disk throughput is limited by the VM type and/or user account type. Adding more disks to the VM is not a way to deliver more IOPS.

Approximately 540-570MB/s random read throughput with a 32KB blocksize was achieved fairly consistently for between 1-6 P30 1TB SSDs attached to a Standard_DS14_v2 VM.

This is slightly higher than the stated 512 MB/s limit for a Standard_DS14_v2 VM, and represents a very good random read throughput figure for a single disk.

However, it is disappointing that this figure can’t be increased nearer to 1,000MB/s (which is attainable on AWS) through the use of more than 1 disk per VM.
