Google’s Dataproc service gets GPUs and management automation features


Dataproc is an analytics service from Google LLC that allows enterprises to spin up managed Spark and Hadoop environments in the cloud. Today, the search giant updated the service with four features that promise to provide a boost for machine learning projects as well as simplify day-to-day maintenance.

Companies using Dataproc for machine learning can now add graphics processing units to their Hadoop and Spark clusters.

GPUs can run artificial intelligence models many times faster than a standard central processing unit, which should translate into a performance boost for users. Google provides eight Nvidia Corp. data center GPUs to choose from in its public cloud, including the chipmaker’s top-end Tesla V100 model.

Also new to Dataproc is autoscaling. The service can now automatically dial the size of a cluster up or down depending on how many hardware resources a workload requires at a given moment.

The autoscaling mechanism comes in handy in several situations, according to Google. It makes it easier to deal with abrupt usage spikes, such as an increase in the volume of data that an analytics application sends to a Spark deployment. Meanwhile, an engineer looking to scale up an algorithm they’ve successfully deployed on a small test cluster can do so without having to manually provision the extra infrastructure they need.

“The cluster will simply grow to the size needed to process the full dataset and then scale itself back down when the processing is completed,” explained Chris Crosbie, a director of product management with Google’s cloud analytics group. “You don’t need to waste time trying to move over to a larger server environment or figure out how to migrate your work.”
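The grow-then-shrink behavior Crosbie describes can be illustrated with a minimal Python sketch. This is not Dataproc's actual autoscaling algorithm, which the announcement does not detail. The function name, the memory-based sizing rule and all parameters are hypothetical, chosen only to show the general idea of sizing a cluster to pending work within fixed bounds.

```python
import math


def target_workers(pending_memory_gb, memory_per_worker_gb, min_workers, max_workers):
    """Return how many workers a hypothetical autoscaler would request.

    Assumption: demand is measured as memory requested by pending work,
    and every worker contributes the same amount of memory.
    """
    # Workers needed to satisfy all pending memory requests, rounded up.
    needed = math.ceil(pending_memory_gb / memory_per_worker_gb)
    # Clamp to the configured bounds so the cluster never shrinks below
    # the minimum or grows past the maximum.
    return max(min_workers, min(max_workers, needed))


# A full-dataset run pushes the cluster up; an idle cluster falls back down.
busy = target_workers(100, 16, min_workers=2, max_workers=50)
idle = target_workers(0, 16, min_workers=2, max_workers=50)
capped = target_workers(10_000, 16, min_workers=2, max_workers=50)
```

In this sketch the same rule handles both directions: when pending demand drops after a job finishes, the computed target falls back toward the minimum.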

Google used the occasion to add a couple of other features meant to help companies operate their Dataproc clusters more efficiently. The first addition, a new configuration option, makes it possible to set a limit on how long a cluster can sit idle and have Dataproc automatically delete it once that threshold is reached. The other new feature lets companies automate certain tasks in SparkR, an extension for Spark that provides the ability to run R programs on the framework.
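The idle-limit option boils down to a simple timeout check. The Python sketch below illustrates that logic only. It is an assumption-laden illustration, not Dataproc's implementation, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def should_delete(last_activity, now, max_idle):
    """Return True once a cluster has been idle for at least max_idle.

    Assumption: "idle" means no job activity since last_activity.
    """
    return now - last_activity >= max_idle


# Hypothetical example: a 30-minute idle limit.
limit = timedelta(minutes=30)
last_job_done = datetime(2019, 1, 1, 12, 0)

still_running = should_delete(last_job_done, datetime(2019, 1, 1, 12, 10), limit)
expired = should_delete(last_job_done, datetime(2019, 1, 1, 12, 30), limit)
```

Here the cluster is kept after 10 idle minutes and flagged for deletion once the 30-minute limit is reached.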

Image: Google
