Gingerbread Cookies: 2011

Wednesday, October 12, 2011

GLSL programming - glGetUniformLocationARB

I recently started using GLSL again for my graphics applications.
One pitfall with glGetUniformLocationARB is worth remembering: do not expect a valid location for a uniform variable that the shader never references. The compiler discards unreferenced uniforms, so the call returns -1 for them. Remembering this will save you at least an hour of debugging.

Sunday, October 2, 2011

OpenCL resources

OpenCL is an open standard for parallel programming across heterogeneous processors (e.g., GPUs and CPUs).

OpenCL SDKs

Videos

Introduction to OpenCL at NVIDIA GTC 2010

Saturday, October 1, 2011

Performance metrics

1. Sources of overhead in parallel programming
Interprocess interaction (e.g., communication) and idling (e.g., load imbalance and synchronization).

2. Execution time and total parallel overhead

serial runtime (Ts): the time elapsed between the beginning and the end of a program's execution on a sequential computer.
parallel runtime (Tp): the time elapsed from the moment a parallel computation starts to the moment the last processing element finishes execution.
overhead function/total overhead (To) = p * Tp - Ts
(p: the number of processing elements)

3. Speedup (S)
Speedup measures the relative performance benefit of a parallel algorithm over a sequential algorithm. It is defined as
S = Ts / Tp

Amdahl's law, which gives the theoretical maximum speedup, defines the speedup as
      S = 1 / ((1 - f) + f / p),
      where f is the fraction of the execution time that is perfectly parallelizable and p is the number of
      processing elements.

4. Efficiency (E)
     Efficiency is the fraction of time for which a processing element is usefully employed:
     E = S / p = Ts / (p * Tp)
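
The metrics above are simple enough to check numerically. Below is a minimal Python sketch of the four formulas (To, S, Amdahl's S, and E). The function names and the sample timings (Ts = 100 s, Tp = 30 s, p = 4) are made up for illustration and are not measurements.

```python
def total_overhead(ts, tp, p):
    """To = p * Tp - Ts"""
    return p * tp - ts


def speedup(ts, tp):
    """S = Ts / Tp"""
    return ts / tp


def efficiency(ts, tp, p):
    """E = S / p = Ts / (p * Tp)"""
    return ts / (p * tp)


def amdahl_speedup(f, p):
    """S = 1 / ((1 - f) + f / p), with f the parallelizable fraction."""
    return 1.0 / ((1.0 - f) + f / p)


if __name__ == "__main__":
    ts, tp, p = 100.0, 30.0, 4  # hypothetical timings in seconds
    print("overhead  :", total_overhead(ts, tp, p))
    print("speedup   :", speedup(ts, tp))
    print("efficiency:", efficiency(ts, tp, p))
    # Amdahl's bound for a program that is 90% parallelizable
    for cores in (2, 4, 16, 1024):
        print(cores, "cores ->", amdahl_speedup(0.9, cores))
```

With f = 0.9, the Amdahl speedup approaches 1 / (1 - f) = 10 as p grows, no matter how many processing elements are added.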

References

Grama et al., Introduction to Parallel Computing
Amdahl's Law in the Multicore Era at Google tech talk in 2009
Amdahl's law in Wikipedia

Wednesday, September 28, 2011

Unix shell commands

Delete all subdirectories with a specific name (here, .svn):
find . -type d -name .svn -prune -exec rm -rf {} +
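
For a platform without find and xargs, the same cleanup can be sketched in Python. This is a minimal sketch rather than a drop-in replacement: the function name remove_named_dirs is made up, and it assumes the intent of the shell command above is to delete every directory named exactly .svn.

```python
import os
import shutil


def remove_named_dirs(root, name=".svn"):
    """Delete every directory called `name` under `root`; return the deleted paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if name in dirnames:
            target = os.path.join(dirpath, name)
            shutil.rmtree(target)
            # Remove the entry in place so os.walk does not descend into it.
            dirnames.remove(name)
            removed.append(target)
    return removed


if __name__ == "__main__":
    for path in remove_named_dirs("."):
        print("removed", path)
```

Like rm -r, this deletes without asking, so it is worth printing the matches first on a real working copy.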

Tuesday, September 20, 2011

NVIDIA Tegra

NVIDIA Tegra is a system-on-a-chip (SoC) series for mobile computing devices. It is a heterogeneous architecture with multiple processors: a CPU, an HD video processor, an imaging processor, an audio processor, and an ultra-low-power GeForce GPU (see the article). For details on the Tegra architecture, see the whitepaper on the NVIDIA® Tegra™ Multi-processor Architecture.

Tegra2 series
It has a dual-core ARM Cortex-A9 CPU (lacking ARM's Advanced SIMD extension, NEON), a ULP GeForce GPU with 4 pixel shaders + 4 vertex shaders (OpenGL ES 2.0), a single-channel memory controller (see the link), and a 1080p video playback processor.

Project Kal-El
It has a quad-core CPU and a twelve-core GPU. It is five times faster than the Tegra2.
While searching for information on the project, I found an NVIDIA blog post stating that it will have a fifth core. See the post.

NVIDIA Project Kal-El Demo: Glowball

Tegra Roadmap
The figure below shows NVIDIA's roadmap for its chipsets. According to the roadmap, the Stark chipset will be 75 times faster than the Tegra2. It's amazing.

Roadmap for Tegra series

Tegra development resources
NVIDIA provides resources for Tegra developers at http://developer.nvidia.com/node/19096

Friday, September 16, 2011

CMake

CMake is an open-source, cross-platform build system.
All you need to do is create a CMakeLists.txt file that contains your project configuration.

1. Installation

Windows: You can download an installation package at http://www.cmake.org/cmake/resources/software.html
Linux (Debian/Ubuntu): Run the command below
sudo apt-get install build-essential cmake

2. How to create a CMakeLists.txt to build your program
Here is a simple example of a CMakeLists.txt.

# Specify the minimum version of CMake
cmake_minimum_required(VERSION 2.6) 

# Define the name of your project. If you use Eclipse as your IDE,
# the project name should differ from the name of the executable
# file.
PROJECT( helloWorldPrj ) 

# List all the source files with the SET command. This project
# has a single file, main.cpp.
SET (
        SRCS
        main.cpp
)

# List all your include directories
INCLUDE_DIRECTORIES (
        ${CMAKE_SOURCE_DIR}
)

# List all your library directories.
# As an example, the source code directory is specified.
LINK_DIRECTORIES (
        ${CMAKE_SOURCE_DIR}
)

# Specify all your source files with your executable name.
# Since SRCS is defined above, ${SRCS} is included here.
ADD_EXECUTABLE(helloWorld
        ${SRCS}
)

# List all the libraries needed at link time
TARGET_LINK_LIBRARIES(helloWorld
        # List your libraries if you have any
)

3. How to build your program with CMakeLists.txt:

cmake
cmake-gui
qtcreator
Eclipse

4. How to build your CUDA program with CMakeLists.txt:
Add the commands below to your CMakeLists.txt. Note that CMake 2.8 or later is required.
The CMakeLists.txt is downloadable at http://vrinside.net/myfiles/simplecuda_cmake.zip

# finding the CUDA path
FIND_PACKAGE(CUDA)

# Add the CUDA header file and library directories and CUDA libraries
INCLUDE(FindCUDA)

# The command below creates a file group for Visual Studio users.
# It uses CUDA_SRCS, the same variable passed to CUDA_ADD_EXECUTABLE.
IF(WIN32)
        SOURCE_GROUP("CUDA Files" FILES ${CUDA_SRCS})
ENDIF(WIN32)

# Replace ADD_EXECUTABLE with CUDA_ADD_EXECUTABLE
CUDA_ADD_EXECUTABLE(${TARGET_NAME}
        ${SRCS}
        ${CUDA_SRCS}
)

5. How to build your OpenMP program:
Add the lines below to your CMakeLists.txt (-fopenmp is the GCC flag).

SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fopenmp")
SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fopenmp")

6. How to create an install option for your Makefile

For example:

INSTALL_FILES (/$ENV{VR_ROOT_PATH}/include/vrio FILES ${IO_HEADER})

7. How to test environment variables in an IF command

SET (MYVAR1 $ENV{TACC_CUDA_DIR})

IF (MYVAR1)

...

ENDIF (MYVAR1)


8. How to execute a command and assign the result to a variable

EXECUTE_PROCESS(COMMAND echo $ENV{TACC_CUDA_DIR} OUTPUT_VARIABLE MYVAL2)

9. Links

http://vtk.org/Wiki/CMake

How to start programming with CUDA

I used to mentor undergraduate students on the GPGPU team, one of the VIP (Vertically Integrated Project) teams. Since most of the students who joined the team had not done any GPU programming, I guided them as follows:

Read chapters 2 and 3 of the CUDA programming guide to understand the CUDA programming model, execution model, and programming interfaces. Nowadays, there are many videos that introduce CUDA programming. (See the useful links below.)
Improve your understanding of GPGPU programming through hands-on experience:

Write a program that increments every element of an input array by 1, to understand the execution model and the programming model.
Modify a matrix multiplication example to support matrices whose sizes are not powers of two, to understand the memory model (global, shared, and local) and the execution model.

Learn about the prefix sum, since it is commonly used in data-parallel programming (e.g., reduction): 1, 2.
Start the class projects, since the students are now ready for them.
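
To make the prefix-sum step concrete, here is a minimal Python sketch of an inclusive scan in the Hillis-Steele style. This is one common data-parallel formulation, chosen here as an assumption; it is not necessarily the one used in the linked material. The inner loop is sequential in this sketch, but each of its iterations is independent, so on a GPU each index i would be handled by its own thread.

```python
def inclusive_scan(values):
    """Inclusive prefix sum computed in log2(n) data-parallel style passes."""
    result = list(values)
    n = len(result)
    offset = 1
    while offset < n:
        # Read from a snapshot so that every "thread" sees the values of the
        # previous pass (on a GPU this is double buffering plus a barrier).
        previous = list(result)
        for i in range(offset, n):  # each i is independent within a pass
            result[i] = previous[i] + previous[i - offset]
        offset *= 2
    return result


if __name__ == "__main__":
    print(inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
```

The last element of the scan is the sum of the whole array, which is why the prefix sum and reduction are usually taught together.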

Useful links are below:

Thursday, September 15, 2011

Volume datasets

Below is a list of web links where you can download volume datasets.

Hurricane Isabel data
This is the IEEE Visualization 2004 Contest dataset.
Volume Library
VolVis Datasets
Turbulent Combustion Simulation

MapReduce

MapReduce is a programming model for processing and generating large datasets. It was first proposed by Google. Several implementations are available, including Hadoop MapReduce and Phoenix.
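
The programming model itself is small. Below is a minimal single-process Python sketch of its three phases (map, shuffle, reduce) using word counting as the example. It illustrates the model only and does not reflect the API of Hadoop MapReduce or Phoenix; all function names are made up.

```python
from collections import defaultdict


def map_phase(documents, mapper):
    """Apply the mapper to every input record and collect (key, value) pairs."""
    pairs = []
    for document in documents:
        pairs.extend(mapper(document))
    return pairs


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count(documents):
    def mapper(text):
        return [(word, 1) for word in text.split()]

    def reducer(_word, counts):
        return sum(counts)

    return reduce_phase(shuffle(map_phase(documents, mapper)), reducer)


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In a real implementation the map and reduce calls run in parallel on many workers, and the shuffle moves data between them.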

Some research has run MapReduce on GPUs (Mars), or on both CPUs and GPUs together.

Mars: A MapReduce Framework on Graphics Processors
The source code is available at http://www.cse.ust.hk/gpuqp/Mars.html
Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
The presentation slides are downloadable here.

Data formats

Below are most of the data formats that I have dealt with in scientific visualization.

VTK [Wikipedia]
The Visualization Toolkit (VTK) is an open-source software system for 3D computer graphics and visualization.
OpenDX [Wikipedia]
It is an open-source visualization software package based on IBM's Visualization Data Explorer.
NetCDF (Network Common Data Form) [Wikipedia]
According to the Wikipedia NetCDF page, it is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

Reading NetCDF files in MATLAB and Analyzing the Precipitation Data

Silo [Wikipedia]
Silo is a computer data format and library for storing rectilinear, curvilinear, unstructured, or point meshes in 2D and 3D. It is built on top of other low-level storage libraries such as PDB, NetCDF, and HDF5. Silo example codes are available at http://www.e-science.le.ac.uk/format/silocode.shtml.

Wednesday, September 14, 2011

My test program using Qualcomm Augmented Reality “QCAR” SDK

When I create a software component, I try to make it as OS-independent as possible. I have developed a 3D graphics rendering engine for my research projects, and last year I compiled it with the Android NDK for fun. Below is a screenshot of my program. You can see a video of it by clicking the image.

The 3D model is from one of the short performances featuring the character "Floops," created by SGI.

Liferay 6.0: Open source solutions for web portals and collaboration.

My advisor wanted me to try Liferay, an open-source web portal solution, to manage all VACCINE projects. Together with one of my labmates, I have finished the initial setup of the version bundled with Tomcat. It is now available at http://pixel.ecn.purdue.edu:8090/. After checking its functionality, we found it very useful for sharing information among project members and for managing projects. It provides content management through the web content display and the document library, and it supports collaboration through the message board, blog, wiki, and calendar.

Introduction to Machine learning

This semester, I am auditing a class in machine learning. While learning about some algorithms in a computer vision class, I realized that machine learning could be applicable in many different domains.

Below are the useful links that I usually refer to.

Class web site (CS590): http://learning.stat.purdue.edu/wiki/courses/fall2011/cs590/start
Reference class websites:

Machine learning at Stanford: 1, 2
