1 - 48 of 48 results
Cleansing, processing, and visualizing a data set, Part 3: Visualizing data
In this tutorial, discover some of the more useful applications for visualizing data and a few of the approaches you can use to create that visualization, including the R programming language, gnuplot, and Graphviz.
Also available in: Chinese  
Articles 17 Jan 2018
Cleansing, processing, and visualizing a data set, Part 2: Gaining invaluable insight from clean data sets
Learn about VQ and ART algorithms. VQ quickly and efficiently clusters a data set; ART adapts the number of clusters based on the data set.
Also available in: Chinese  
Articles 04 Jan 2018
Cleansing, processing, and visualizing a data set, Part 1: Working with messy data
Discover common problems associated with cleansing data for validation and processing, with solutions for dealing with them. You'll also find a custom tool to make the process of cleansing data and merging data sets for analysis easier.
Also available in: Chinese  
Articles 14 Dec 2017
Speaking out loud
Natural language processing and other artificial intelligence-related technologies are all around us. Discover how the science began and where it might go in the future.
Also available in: Chinese   Japanese  
Articles 13 Jun 2017
Fight against SQL injection attacks
In the world of security exploits, one vulnerability, although easily resolved, is number one on the OWASP Top 10: the Structured Query Language (SQL) injection attack. Although this class of attack has existed since 1995, it remains one of the most prevalent attacks on web assets. Get to know the SQL injection attack and discover how it's carried out on a production website. Then learn how to test a website for this class of vulnerability by using IBM Security AppScan Standard.
Also available in: Russian   Japanese  
Articles 04 Feb 2014
Extract information from the web with Ruby
Explore the latest methods for extracting structured information from the web. Using Ruby script examples, author M. Tim Jones demonstrates scraping technology and the use of web APIs for targeted data retrieval.
Also available in: Chinese   Russian   Japanese  
Articles 17 Dec 2013
Recommender systems, Part 1: Introduction to approaches and algorithms
Most large-scale commercial and social websites recommend options, such as products or people to connect with, to users. Recommendation engines sort through massive amounts of data to identify potential user preferences. This article, the first in a two-part series, explains the ideas behind recommendation systems and introduces you to the algorithms that power them. In Part 2, learn about some open source recommendation engines you can put to work.
Also available in: Chinese   Russian   Japanese  
Articles 12 Dec 2013
Recommender systems, Part 2: Introducing open source engines
Part 1 of this series introduces the basic approaches and algorithms for the construction of recommendation engines. This concluding installment explores some open source solutions for building recommendation systems and demonstrates the use of two of them. The author also shows how to develop a simple clustering application in Ruby and apply it to sample data.
Also available in: Chinese   Russian   Japanese  
Articles 12 Dec 2013
Simplifying scalable cloud software development with Apache Thrift
Apache Thrift is a framework that enables scalable cross-language development, resulting in unambiguous communication among components in cloud environments. This article introduces the ideas around Thrift (an interface definition for remote procedure call with multilanguage bindings), and then demonstrates Thrift in a multilanguage client and server application.
Also available in: Russian   Japanese   Portuguese  
Articles 12 Nov 2013
Static and dynamic testing in the software development life cycle
Yesterday, the idea of application security was mostly an afterthought. But given the plethora of news on hacking and underground economies for exploits, security testing is now an integral part of the software development life cycle. This article explores two aspects of security testing and the open source tools that simplify their execution.
Articles 26 Aug 2013
Data science and open source
Data science combines mathematics and computer science for the purpose of extracting value from data. This article introduces data science and surveys prominent open source tools in this rapidly growing field.
Also available in: Russian   Japanese  
Articles 09 Aug 2013
Weaving data visualizations with the Weave platform
Weave is a new platform for the visualization of trend and geographical data, developed at the University of Massachusetts Lowell. Weave supports a wide range of uses and is intended for both novice and advanced users. Explore the use of Weave for visualizing data from publicly available repositories with the hands-on examples in this article.
Also available in: Japanese  
Articles 23 Apr 2013
Process real-time big data with Twitter Storm
Storm is an open source, big-data processing system that differs from other systems in that it's intended for distributed real-time processing and is language independent. Learn about Twitter Storm, its architecture, and the spectrum of batch and stream processing solutions.
Also available in: Chinese   Russian   Japanese   Portuguese   Spanish  
Articles 02 Apr 2013
Nested virtualization for the next-generation cloud
Cloud computing is a reality that has changed both business and development models for online businesses. But current Infrastructure as a Service (IaaS) cloud models place a significant constraint on software developers, namely by dictating the hypervisor they must use. The requirement on virtual machine images removes choice from cloud users and to some is an obstacle for moving to the cloud. But a breakthrough from IBM Research will change this model. The technology, called nested virtualization, permits a deeper solution stack where guest virtual machines sit on a guest hypervisor, which in turn runs on the cloud's chosen hypervisor. In this article, we explore the ideas behind nested virtualization and how it will improve the cloud.
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 22 Aug 2012
Understand Representational State Transfer (REST) in Ruby
REST, or Representational State Transfer, is a distributed communication architecture that is quickly becoming the lingua franca for clouds. It's simple, yet expressive enough to represent the plethora of cloud resources and overall configuration and management. Learn how to develop a simple REST agent from the ground up in Ruby to learn its implementation and use.
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 17 Aug 2012
Cloud computing and storage with OpenStack
The Infrastructure as a Service (IaaS) cloud platform space is quite diverse, with well-known solutions like Nebula and Eucalyptus. But a relative newcomer to this space has shown considerable growth, not only in users but also in the number of supporting companies. Get to know the open source platform OpenStack, and discover whether it's really the open source cloud operating system.
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 06 Aug 2012
Introducing the 3.3 and 3.4 Linux kernels
In March 2012, version 3.3 of the Linux kernel was released (followed by version 3.4 in May). In addition to a plethora of small features and bug fixes, several important changes have arrived with these releases, including the merging of the Google Android project; merging of the Open vSwitch; several networking improvements (including the teaming network device); and a variety of file system, memory management, and virtualization updates. Explore many of the important changes in versions 3.3 and 3.4, and have a peek at what's ahead in 3.5.
Also available in: Chinese   Russian   Japanese  
Articles 19 Jun 2012
Anatomy of an open source cloud
Cloud computing is no longer a technology on the cusp of breaking out, but a valuable and important technology that is fundamentally changing the way we use and develop on-demand applications. As you would expect, Linux and open source provide the foundation for the cloud (for both public and private infrastructures). Explore the anatomy of the cloud, its architecture, and the open source technologies used to build these dynamic and scalable computing and storage platforms.
Also available in: Chinese   Russian   Japanese   Portuguese   Spanish  
Articles 05 Jun 2012
Practice: Process logs with Apache Hadoop
Logs are an essential part of any computing system, supporting capabilities from audits to error management. As logs grow and the number of log sources increases (such as in cloud environments), a scalable system is necessary to efficiently process logs. This practice session explores processing logs with Apache Hadoop from a typical Linux system.
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 30 May 2012
Optimizing resource management in supercomputers with SLURM
The arms race of supercomputers is fascinating to watch as their evolving architectures squeeze out more and more performance. One interesting fact about supercomputers is that they all run a version of Linux. To yield the greatest amount of power from an architecture, the SLURM open source job scheduler (used by the Chinese Tianhe-1A supercomputer and the upcoming IBM Sequoia supercomputer) optimizes resource allocation and monitoring. Learn about SLURM and its approach to parallelizing workloads in clusters.
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 22 May 2012
Look at Linux, the operating system and universal platform
Linux is everywhere. Whether you peer into the smallest smartphone, the virtual backbone of the Internet, or the largest and most powerful supercomputer, you'll find Linux. That's no simple feat given the range of capabilities expected from these platforms. Discover the omnipresence of Linux and how it supports devices large and small as well as everything in between.
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 13 Mar 2012
Process your data with Apache Pig
Apache Pig is a high-level procedural language for querying large semi-structured data sets using Hadoop and the MapReduce platform. Pig simplifies the use of Hadoop by allowing SQL-like queries to a distributed data set. Explore the language behind Pig and discover its use in a simple Hadoop cluster.
Also available in: Chinese   Russian   Japanese   Portuguese   Spanish  
Articles 28 Feb 2012
Data analysis and performance with Spark
Spark is an interesting alternative to Hadoop, with a focus on in-memory data processing. This practice session explores multithread and multinode performance with Scala, Spark, and its tunable parameters.
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 14 Feb 2012
Evolution of shells in Linux
Pointing and clicking is fine for most day-to-day computing tasks, but to really take advantage of the strengths of Linux over other environments, you eventually need to crack the shell and enter the command line. Lots of command shells are available, from Bash and Korn to C shell and various exotic and strange shells. Learn which shell is right for you. [Note: Minor corrections were made to Listings 2 and 3.]
Also available in: Chinese   Russian   Japanese   Portuguese  
Articles 09 Dec 2011
Spark, an alternative for fast data analytics
Although Hadoop captures the most attention for distributed data analytics, there are alternatives that provide some interesting advantages over the typical Hadoop platform. Spark is a scalable data analytics platform that incorporates primitives for in-memory computing and therefore offers some performance advantages over Hadoop's cluster storage approach. Spark is implemented in and exploits the Scala language, which provides a unique environment for data processing. Get to know the Spark approach for cluster computing and its differences from Hadoop.
Also available in: Chinese   Russian   Japanese  
Articles 01 Nov 2011
Data mining with Ruby and Twitter
Twitter is not only a fantastic real-time social networking tool, it's also a source of rich information that's ripe for data mining. On average, Twitter users generate 140 million tweets per day on a variety of topics. This article introduces you to data mining and demonstrates the concept with the object-oriented Ruby language.
Also available in: Chinese   Russian   Japanese   Portuguese   Spanish  
Articles 11 Oct 2011
Open source physics engines
A physics engine is a software component that provides a simulation of a physical system. This simulation can include soft- and rigid-body dynamics, fluid dynamics, and collision detection. This article introduces the use and basics of a physics engine and explores two options that exist: Box2D and Bullet.
Articles 07 Jul 2011
Ceylon: True advance, or just another language?
The language road in computer science is littered with the carcasses of what was to be "the next big thing." And although many niche languages do find some adoption in scripting or specialized applications, C (and its derivatives) and the Java language are difficult to displace. But Red Hat's Ceylon appears to be an interesting combination of language features, using a well-known C-style syntax but with support for object orientation and useful functional aspects in addition to an emphasis on being succinct. Explore Ceylon and find out if this future VM language can find a place in enterprise software development. [Update: The fail block is clarified in Listing 7. -Ed.]
Also available in: Chinese   Russian   Japanese   Spanish  
Articles 07 Jul 2011
Application virtualization, past and future
When you hear the phrase "virtual machine" today, you probably think of virtualization and hypervisors. But VMs are simply an older concept of abstraction, a common method of abstracting one entity from another. This article explores two of the many newer open source VM technologies: Dalvik (the VM core of the Android operating system) and Parrot (an open source VM technology for efficiently executing dynamic languages).
Also available in: Russian   Japanese   Portuguese   Spanish  
Articles 03 May 2011
Virtualization for embedded systems
Today's technical news is filled with stories of server and desktop virtualization, but there's another virtualization technology that's growing rapidly: embedded virtualization. The embedded domain has several useful applications for virtualization, including mobile handsets, security kernels, and concurrent embedded operating systems. This article explores the area of embedded virtualization and explains why it's coming to an embedded system near you.
Also available in: Russian   Japanese   Portuguese  
Articles 19 Apr 2011
Linux and the storage ecosystem
Linux is the Swiss Army knife of file systems, and it also offers a wide variety of storage technologies for both desktops and servers. Beyond the file system, Linux incorporates world-class NAS and SAN technologies, data protection, storage management, support for clouds, and solid-state storage. Learn more about the Linux storage ecosystem and why it's number one in server market share.
Also available in: Russian   Japanese   Portuguese  
Articles 29 Mar 2011
Emulation and computing history
Everything we have today is derived from older computing systems, many of which no longer have functioning hardware you can use. Learn how the Computer History Simulation Project brings this hardware (and operating systems and applications) back to life so they can be enjoyed by a new generation.
Also available in: Russian   Japanese   Portuguese  
Articles 22 Mar 2011
Linux Scheduler simulation
Scheduling is one of the most complex, and interesting, aspects of the Linux kernel. Developing schedulers that provide suitable behavior for everything from single-core machines to quad-core servers can be difficult. Luckily, the Linux Scheduler Simulator (LinSched) hosts your Linux scheduler in user space (for scheduler prototyping) while modeling arbitrary hardware targets to validate your scheduler across a spectrum of topologies. Learn about LinSched and how to experiment with your scheduler for Linux.
Also available in: Russian   Japanese   Portuguese  
Articles 23 Feb 2011
Data visualization with Processing, Part 3: 2-D, 3-D, physics, and networking
This final article in the "Data visualization with Processing" series explores some of Processing's more advanced features, starting with an introduction to 2D and 3D graphics and lighting features. Then, explore physics applications with graphical visualization, learn about Processing's networking features, and develop a simple application that visualizes data from the Internet.
Also available in: Japanese   Portuguese  
Articles 22 Feb 2011
Platform emulation with Bochs
Bochs, like QEMU, is a portable emulator that provides a virtualization environment in which to run an operating system using an emulated platform in the context of another operating system. Bochs isn't a hypervisor but rather a PC-compatible emulator useful for legacy software. Learn about platform emulation using Bochs and its approach to hardware emulation.
Also available in: Japanese  
Articles 25 Jan 2011
Run ZFS on Linux
Although ZFS exists in an operating system whose future is at risk, it is easily one of the most advanced, feature-rich file systems in existence. It incorporates variable block sizes, compression, encryption, de-duplication, snapshots, clones, and (as the name implies) support for massive capacities. Get to know the concepts behind ZFS and learn how you can use ZFS today on Linux using Filesystem in Userspace (FUSE).
Articles 19 Jan 2011
Data visualization with Processing, Part 2: Intermediate data visualization using interfaces, objects, images, and applications
Part 1 of this "Data visualization with Processing" series introduces the Processing language and development environment and demonstrates the language's basic graphical capabilities. This second article explores Processing's more advanced features, including UIs and object-oriented programming. Learn about image processing and how to convert your Processing application into a Java applet suitable for the web, and explore an optimization algorithm that lends itself well to visualization.
Also available in: Japanese   Portuguese  
Articles 11 Jan 2011
Data visualization with Processing, Part 1: An introduction to the language and environment
Also available in: Japanese  
Articles 30 Nov 2010
Network file systems and Linux
Network File System (NFS) has been around since 1984, but it continues to evolve and provide the basis for distributed file systems. Today, NFS (through the pNFS extension) provides scalable access to files distributed across a network. Explore the ideas behind distributed file systems and in particular, recent advances in NFS.
Also available in: Russian   Japanese   Portuguese  
Articles 10 Nov 2010
Virtual networking in Linux
With the explosive growth of platform virtualization, it's not surprising that other parts of the enterprise ecosystem are being virtualized, as well. One of the more recent areas is virtual networking. Early implementations of platform virtualization created virtual NICs, but today, larger portions of the network are being virtualized, such as switches that support communication among VMs on a server or distributed among servers. Explore the ideas behind virtual networking, with a focus on NIC and switch virtualization.
Also available in: Russian   Japanese   Portuguese  
Articles 27 Oct 2010
Kernel logging: APIs and implementation
In kernel development, we use printk for logging without much thought. But have you considered the process and underlying implementation of kernel logging? Explore the entire process of kernel logging, from printk to insertion into the user space log file.
Also available in: Japanese   Portuguese  
Articles 30 Sep 2010
User space memory access from the Linux kernel
As the kernel and user space exist in different virtual address spaces, there are special considerations for moving data between them. Explore the ideas behind virtual address spaces and the kernel APIs for data movement to and from user space, and learn some of the other mapping techniques used to map memory.
Also available in: Russian   Japanese   Portuguese  
Articles 11 Aug 2010
High availability with the Distributed Replicated Block Device
The 2.6.33 Linux kernel has introduced a useful new service called the Distributed Replicated Block Device (DRBD). This service mirrors an entire block device to another networked host during run time, permitting the development of high-availability clusters for block data. Explore the ideas behind the DRBD and its implementation in the Linux kernel.
Also available in: Japanese  
Articles 04 Aug 2010
Distributed data processing with Hadoop, Part 3: Application development
With configuration, installation, and the use of Hadoop in single- and multinode architectures under your belt, you can now turn to the task of developing applications within the Hadoop infrastructure. This final article in the series explores the Hadoop APIs and data flow and demonstrates their use with a simple mapper and reducer application.
Also available in: Russian   Japanese   Portuguese  
Articles 14 Jul 2010
Ceph: A Linux petabyte-scale distributed file system
Linux continues to invade the scalable computing space and, in particular, the scalable storage space. A recent addition to Linux's impressive selection of file systems is Ceph, a distributed file system that incorporates replication and fault tolerance while maintaining POSIX compatibility. Explore the architecture of Ceph and learn how it provides fault tolerance and simplifies the management of massive amounts of data.
Also available in: Japanese   Portuguese  
Articles 04 Jun 2010
Distributed data processing with Hadoop, Part 2: Going further
The first article in this series showed how to use Hadoop in a single-node cluster. This article continues with a more advanced setup that uses multiple nodes for parallel processing. It demonstrates the various node types required for multinode clusters and explores MapReduce functionality in a parallel environment. This article also digs into the management aspects of Hadoop -- both command line and Web based.
Also available in: Russian   Japanese   Portuguese  
Articles 03 Jun 2010
Distributed data processing with Hadoop, Part 1: Getting started
This article -- the first in a series on Hadoop -- explores the Hadoop framework, including its fundamental elements, such as the Hadoop file system (HDFS), and node types that are commonly used. Learn how to install and configure a single-node Hadoop cluster, and delve into the MapReduce application. Finally, discover ways to monitor and manage Hadoop using its core Web interfaces.
Also available in: Russian   Japanese   Portuguese  
Articles 18 May 2010
Anatomy of Linux Kernel Shared Memory
Linux as a hypervisor includes a number of innovations, and one of the more interesting changes in the 2.6.32 kernel is Kernel Shared Memory (KSM). KSM allows the hypervisor to increase the number of concurrent virtual machines by consolidating identical memory pages. Explore the ideas behind KSM (such as storage de-duplication), its implementation, and how you manage it.
Also available in: Japanese   Portuguese  
Articles 07 Apr 2010