Recently, I started using GLSL again for my graphics applications.
I found one thing about glGetUniformLocationARB that I should not forget: do not try to get the location of a uniform variable that is declared but never referenced in the shader. The compiler discards such unused variables, so the call returns -1. I believe that remembering this will save you at least an hour.
Wednesday, October 12, 2011
Sunday, October 2, 2011
OpenCL resources
OpenCL is an open standard for parallel programming across heterogeneous processors (e.g., GPUs and CPUs).
OpenCL SDKs
Videos
Saturday, October 1, 2011
Performance metrics
1. Sources of overhead in parallel programming
- interprocess interaction (e.g., communication) and idling (e.g., load imbalance and synchronization)
2. Execution time and total parallel overhead
- serial runtime (Ts): the time elapsed between the beginning and the end of a program's execution on a sequential computer.
- parallel runtime (Tp): the time elapsed from the moment a parallel computation starts to the moment the last processing element finishes execution.
- overhead function/total overhead (To) = p * Tp - Ts
(p: the number of processing elements)
3. Speedup (S)
Speedup measures the relative performance benefit of a parallel algorithm over a sequential algorithm. It is defined as
S = Ts / Tp
Amdahl's law, which gives the theoretical maximum speedup, defines the speedup as
S = 1 / ((1-f) + f / p),
where f is the fraction of the execution time that is perfectly parallelizable and p is the number of processing elements.
4. Efficiency (E)
E = S / p = Ts/ (p*Tp)
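The metrics above can be tried out with a few lines of Python. This is only a minimal sketch: the function names and the timings in the example are made up for illustration and do not come from any measurement.

```python
def speedup(ts, tp):
    """S = Ts / Tp."""
    return ts / tp


def efficiency(ts, tp, p):
    """E = S / p = Ts / (p * Tp)."""
    return ts / (p * tp)


def total_overhead(ts, tp, p):
    """To = p * Tp - Ts."""
    return p * tp - ts


def amdahl_speedup(f, p):
    """Amdahl's law: S = 1 / ((1 - f) + f / p), where f is the parallelizable fraction."""
    return 1.0 / ((1.0 - f) + f / p)


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only for illustration.
    ts, tp, p = 100.0, 30.0, 4
    print("speedup    :", speedup(ts, tp))
    print("efficiency :", efficiency(ts, tp, p))
    print("overhead   :", total_overhead(ts, tp, p))
    # Upper bound on speedup when 90% of the runtime is parallelizable.
    print("Amdahl     :", amdahl_speedup(0.9, p))
```

Note that with f < 1 the Amdahl speedup stays below 1 / (1 - f) no matter how large p becomes.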
References
- Grama et al., Introduction to Parallel Computing
- Amdahl's Law in the Multicore Era, Google Tech Talk, 2009
- Amdahl's law in Wikipedia
Wednesday, September 28, 2011
Unix shell commands
- delete subdirectories with a specific name (here, .svn)
find . -type d -name .svn -prune -exec rm -rf {} +
Tuesday, September 20, 2011
NVIDIA Tegra
NVIDIA Tegra is a system-on-a-chip (SoC) series for mobile computing devices. It is a heterogeneous architecture with multiple processors: a CPU, an HD video processor, an imaging processor, an audio processor, and an ultra-low-power GeForce GPU (see the article). For details on the Tegra architecture, see the whitepaper on the NVIDIA® Tegra™ Multi-processor Architecture.
Tegra2 series
It has a dual-core ARM Cortex-A9 CPU (lacking ARM's Advanced SIMD extension, NEON), a ULP GeForce GPU with 4 pixel shaders + 4 vertex shaders (OpenGL ES 2.0), a single-channel memory controller (see the link), and a 1080p video playback processor.
Project Kal-El
It has a quad-core CPU and a twelve-core GPU. It is said to be five times faster than the Tegra2.
While searching for information on the project, I found an NVIDIA blog post stating that it will have a fifth core. See the post.
[Video: NVIDIA Project Kal-El Demo: Glowball]
Tegra Roadmap
The figure below shows NVIDIA's roadmap for its chipsets. According to the roadmap, the Stark chipset is expected to be about 75 times faster than the Tegra2. It's amazing.
[Figure: Roadmap for Tegra series]
Tegra development resources
NVIDIA provides resources for Tegra developers at http://developer.nvidia.com/node/19096
Friday, September 16, 2011
CMake
CMake is an open-source, cross-platform build system.
All you need to do is create a CMakeLists.txt file that contains your project configuration.
1. Installation
- Windows: You can download an installation package at http://www.cmake.org/cmake/resources/software.html
- Linux (Debian/Ubuntu): Run the command below
sudo apt-get install build-essential cmake
2. How to create a CMakeLists.txt to build your program
Here is a simple example of a CMakeLists.txt file.
# Specify the minimum version of CMake
cmake_minimum_required(VERSION 2.6)
# Define the name of your project. If you use Eclipse as your IDE,
# the project name should preferably differ from the name of the
# executable file.
PROJECT( helloWorldPrj )
# List all the source files using the SET command. Here, the project
# has a single file, main.cpp.
SET (
SRCS
main.cpp
)
# List all your include directories
INCLUDE_DIRECTORIES (
${CMAKE_SOURCE_DIR}
)
# List all your library directories.
# As an example, the source code directory is specified.
LINK_DIRECTORIES (
${CMAKE_SOURCE_DIR}
)
# Specify the executable name followed by its source files.
# Since SRCS is defined above, ${SRCS} is used here.
ADD_EXECUTABLE(helloWorld
${SRCS}
)
# List all the libraries needed for linking
TARGET_LINK_LIBRARIES(helloWorld
# List your libraries here if you have any
)
3. How to build your program with CMakeLists.txt:
- cmake
- cmake-gui
- qtcreator
- Eclipse
4. How to build your CUDA program with CMakeLists.txt:
In your CMakeLists.txt, add the commands below. Note that the minimum required version of CMake is 2.8.
The CMakeLists.txt is downloadable at http://vrinside.net/myfiles/simplecuda_cmake.zip
# finding the CUDA path
FIND_PACKAGE(CUDA)
# Add the CUDA header file and library directories and CUDA libraries
INCLUDE(FindCUDA)
# The command line below is to create a file group for Visual
# Studio users
IF(WIN32)
SOURCE_GROUP("CUDA Files" FILES ${CUDA_SRCS})
ENDIF(WIN32)
# Replace ADD_EXECUTABLE with CUDA_ADD_EXECUTABLE
CUDA_ADD_EXECUTABLE(${TARGET_NAME}
${SRCS}
${CUDA_SRCS}
)
5. How to build your OpenMP program:
Add the commands below to your CMakeLists.txt (the -fopenmp flag is for GCC-compatible compilers).
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fopenmp")
SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fopenmp")
6. How to create an install option for your Makefile
e.g.)
INSTALL_FILES (/$ENV{VR_ROOT_PATH}/include/vrio FILES ${IO_HEADER})
7. How to test environment variables in an IF command
SET (MYVAR1 $ENV{TACC_CUDA_DIR})
IF (MYVAR1)
...
ENDIF (MYVAR1)
8. How to execute a command and assign the result to a variable
EXECUTE_PROCESS(COMMAND echo $ENV{TACC_CUDA_DIR} OUTPUT_VARIABLE MYVAL2)
9. Links
http://vtk.org/Wiki/CMake
How to start programming with CUDA
I used to help undergraduate students as a mentor for the GPGPU team, one of the VIP (Vertically Integrated Projects) teams. Since most of the students who joined the team had never done GPU programming, I guided them as follows:
- Read chapter 2 and chapter 3 of the CUDA programming guide to understand the CUDA programming model, execution model, and programming interface. Nowadays, there are many introductory videos on CUDA programming. (See the useful links below)
- Improve your understanding of GPGPU programming through hands-on experience
- Write a program that increases every element of an input array by 1, to understand the execution model and the programming model
- Modify a matrix multiplication example to support non-power-of-two-sized matrices, to understand the memory model (global, shared, and local) and the execution model
- Learn about the prefix sum, since it is commonly used in data-parallel programming (e.g., reduction): 1, 2.
- Start the class projects, since the students are now ready for them.
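The first exercise and the prefix sum above can be sketched on the CPU before touching any CUDA code. The Python below is only an illustration under my own assumptions: the function names are made up, the two nested loops stand in for a CUDA grid of thread blocks, and the scan is the simple Hillis-Steele variant rather than a tuned GPU implementation.

```python
def increment_kernel(block_idx, block_dim, thread_idx, data, n):
    """Work done by one simulated CUDA thread: add 1 to its own element."""
    i = block_idx * block_dim + thread_idx  # global index, as in CUDA
    if i < n:  # guard: the last block may have threads past the end of the array
        data[i] += 1


def launch_increment(data, block_dim=4):
    """Simulate a kernel launch by looping over every block and thread."""
    n = len(data)
    grid_dim = (n + block_dim - 1) // block_dim  # ceil(n / block_dim) blocks
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            increment_kernel(block_idx, block_dim, thread_idx, data, n)
    return data


def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum in about log2(n) data-parallel steps."""
    out = list(values)
    offset = 1
    while offset < len(out):
        prev = list(out)  # snapshot plays the role of a barrier between steps
        for i in range(offset, len(out)):  # each i would be one thread on a GPU
            out[i] = prev[i] + prev[i - offset]
        offset *= 2
    return out


if __name__ == "__main__":
    print(launch_increment([0, 1, 2, 3, 4]))
    print(inclusive_scan([1, 2, 3, 4, 5]))
```

The bounds check in increment_kernel is the part students most often forget when the array size is not a multiple of the block size.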
Thursday, September 15, 2011
Volume datasets
Below is a list of web links where you can download volume datasets.
- Hurricane Isabel data
It is the IEEE Visualization 2004 Contest data.
- Volume Library
- VolVis Datasets
- Turbulent Combustion Simulation
MapReduce
MapReduce is a programming model for processing and generating large datasets. It was introduced by Google. Several implementations are available, including Hadoop MapReduce and Phoenix.
Some work has run MapReduce on GPUs (Mars) or on both CPUs and GPUs.
- Mars: A MapReduce Framework on Graphics Processors
The source code is available at http://www.cse.ust.hk/gpuqp/Mars.html
- Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
The presentation slides are downloadable here.
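To make the MapReduce programming model mentioned above concrete, here is a minimal single-process sketch in Python using the classic word-count example. It does not use the API of Hadoop, Phoenix, or Mars; every function name here is my own and is only meant to show the map, shuffle, and reduce phases.

```python
from collections import defaultdict


def word_count_map(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def word_count_reduce(word, counts):
    """Reduce: combine all values emitted for one key."""
    return sum(counts)


def shuffle(pairs):
    """Group the intermediate values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def map_reduce(documents, map_fn, reduce_fn):
    """Run the map, shuffle, and reduce phases sequentially in one process."""
    pairs = []
    for document in documents:  # each map call is independent, hence parallelizable
        pairs.extend(map_fn(document))
    groups = shuffle(pairs)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "The fox"]
    print(map_reduce(docs, word_count_map, word_count_reduce))
```

In a real framework the map calls and the per-key reduce calls run on different workers, and the shuffle step moves data between them.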
Data formats
Below are most of the data formats that I have dealt with for scientific visualization.
- VTK [Wikipedia]
The Visualization Toolkit (VTK) is an open-source software system for 3D computer graphics and visualization.
- OpenDX [Wikipedia]
It is an open-source visualization software package based on IBM's visualization data explorer.
- NetCDF (Network Common Data Form) [Wikipedia]
According to the Wikipedia NetCDF page, it is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
- Silo [Wikipedia]
Silo is a computer data format and library for storing rectilinear, curvilinear, unstructured, or point meshes in 2D and 3D. It is built on top of other low-level storage libraries such as PDB, NetCDF, and HDF5. Silo example codes are available at http://www.e-science.le.ac.uk/format/silocode.shtml.
Wednesday, September 14, 2011
My test program using Qualcomm Augmented Reality “QCAR” SDK
When I create a software component, I try to make it as OS-independent as possible. I have developed a 3D graphics rendering engine for my research projects. Last year, for fun, I compiled it with the Android NDK. Below is a screenshot of my program. You can see a video of it by clicking the image.
The 3D model is from one of the short performances featuring the character "Floops", created by SGI.
Liferay 6.0: Open source solutions for web portals and collaboration.
My advisor wanted me to try Liferay, an open-source web portal solution, to manage all the VACCINE projects. Together with one of my labmates, I have finished the initial setup of the version bundled with Tomcat. It is now available at http://pixel.ecn.purdue.edu:8090/. After checking its functionality, we found it very useful for sharing information among project members and for managing projects. It provides content management through the web content display and the document library, and it supports collaboration through the message board, blog, wiki, and calendar.
Introduction to Machine learning
This semester, I am auditing a class in machine learning. While learning about some algorithms in a computer vision class, I realized that machine learning could be applied in many different domains.
Below are the useful links that I usually refer to.
- Class web site (CS590): http://learning.stat.purdue.edu/wiki/courses/fall2011/cs590/start
- Reference class websites:

