Making Data Meaningful

The United Nations Economic Commission for Europe has published these two very interesting documents.

Making Data Meaningful, Part I

Making Data Meaningful, Part II



I am organising a group at CESP, JNU that will produce scripts using open source tools for reading NSSO and ASI data. The idea is to release these scripts under an open source license. Any researcher who purchases NSSO/ASI data will be able to use these scripts to process them.

We will use these scripts […]


I discovered a new, very useful, R function yesterday: ave.

This is what it does: “Subsets of ‘x[]’ are averaged, where each subset consist of those observations with the same factor levels.”

Interestingly, you are not limited to the average: any function can be supplied through the FUN argument, and its output is set against each observation in the subset.

I wanted […]
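The behaviour of ave described above can be sketched as follows. The data frame and its column names are hypothetical, made up only for illustration. Note that FUN has to be passed by name, because unnamed arguments are treated as grouping factors.

```r
# Hypothetical data: consumption of five households in two states
df <- data.frame(
  state = c("A", "A", "B", "B", "B"),
  cons  = c(10, 20, 5, 15, 40)
)

# Default FUN is mean: each row gets the mean of its own state
df$state_mean <- ave(df$cons, df$state)

# Any other function works too, as long as FUN is named
df$state_max  <- ave(df$cons, df$state, FUN = max)
df$n_in_state <- ave(df$cons, df$state, FUN = length)

print(df)
```

The result has one value per original observation, so it can be stored directly as a new column without a separate merge step.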

Basic skills for using spreadsheet software

Two social science students, who had so far used computers only for limited, specific tasks, recently asked me for a course outline they could follow to pick up basic skills in using spreadsheet software. I prepared the list and thought it might interest others. Hence this post.

1. […]

Moving average/median

?rollmean (package=zoo)
?rollmedian (package=zoo)
?runmed (package=stats)
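A minimal sketch of how these functions are called, using a made-up numeric vector. The zoo calls run only if zoo happens to be installed; the choice of endrule = "keep" is an assumption made here so that the end points are simply left unchanged.

```r
x <- c(1, 3, 2, 8, 4, 6, 5)

# Running median over a window of 3 observations (the window must be odd).
# endrule = "keep" leaves the first and last values as they are.
med3 <- runmed(x, k = 3, endrule = "keep")

# zoo equivalents; by default they return only the complete windows,
# so the result is shorter than x
if (requireNamespace("zoo", quietly = TRUE)) {
  mean3 <- zoo::rollmean(x, k = 3)
  rmed3 <- zoo::rollmedian(x, k = 3)
}
```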

“relevel” factors when using them to create dummy variables


relevel: The levels of a factor are re-ordered so that the level specified by ‘ref’ is first and the others are moved down. This is useful for ‘contr.treatment’ contrasts which take the first level as the reference.
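A short sketch of the effect on dummy variables, using a hypothetical factor called region. After relevel, the reference category "south" is the one left out of the dummy columns.

```r
# Hypothetical factor; levels are alphabetical by default
region <- factor(c("north", "south", "east", "south", "north"))
levels(region)

# Make "south" the reference (first) level
region2 <- relevel(region, ref = "south")
levels(region2)

# Treatment contrasts drop the first level, so dummies are
# created for "east" and "north" only
dummies <- model.matrix(~ region2)
colnames(dummies)
```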

Reading large tables into R

Here are some useful tips on the issue.


Reading data from Microsoft Access files in Linux

Some of the Census 2001 data are in Microsoft Access files (with the filename extension .mdb). An Access file can hold several tables, each of which contains data. A software package called mdbtools can be used to read Access files.

The command mdb-tables can be used to see the names of tables […]
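As a rough sketch, the mdbtools commands can also be driven from R. The file name census2001.mdb and the two helper functions below are hypothetical, and the code assumes that mdb-tables and mdb-export are installed and on the PATH.

```r
# Hypothetical file name; replace with the path to your own .mdb file
mdb_file <- "census2001.mdb"

# List the tables in an Access file, one name per line (mdb-tables -1)
list_mdb_tables <- function(path) {
  system2("mdb-tables", c("-1", shQuote(path)), stdout = TRUE)
}

# Export one table as CSV with mdb-export and read it into a data frame
read_mdb_table <- function(path, table) {
  cmd <- paste("mdb-export", shQuote(path), shQuote(table))
  read.csv(pipe(cmd), stringsAsFactors = FALSE)
}

# Run only when mdbtools and the file are actually available
if (nzchar(Sys.which("mdb-tables")) && file.exists(mdb_file)) {
  tables <- list_mdb_tables(mdb_file)
  first  <- read_mdb_table(mdb_file, tables[1])
  str(first)
}
```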

Dropping columns in subset command

Use “select=c(var1,var2)” in the subset command to select var1 and var2.

Use “select=-c(var1,var2)” in the subset command to drop var1 and var2.
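Both forms can be tried on a small hypothetical data frame; var3 is an extra column added here only so that something remains after dropping.

```r
# Hypothetical data frame with three columns
df <- data.frame(
  var1 = 1:3,
  var2 = c("a", "b", "c"),
  var3 = c(2.5, 3.5, 4.5)
)

# Keep only var1 and var2
kept <- subset(df, select = c(var1, var2))

# Drop var1 and var2, keeping everything else
dropped <- subset(df, select = -c(var1, var2))

names(kept)
names(dropped)
```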


Renaming variables in a dataframe

There is no dedicated command in base R for renaming variables, which makes the task less than obvious for some people. Of course, once you know, it is simple. The following command does the trick.
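One common base-R idiom for this, sketched on a hypothetical data frame df with illustrative column names, is to assign into names():

```r
# Hypothetical data frame with a column we want to rename
df <- data.frame(old_name = 1:3, other = c("a", "b", "c"))

# Replace the matching entry in the vector of column names
names(df)[names(df) == "old_name"] <- "new_name"

names(df)
```

Matching on the name rather than on the column position keeps the command correct even if the column order changes later.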






Use reshape to tabulate

Package reshape is meant for aggregating, reshaping and tabulating data. Tabulation is done in two steps: melt and cast. Read help for these functions.


This will create a dataframe sl2 which will have all the variables in sl1, with “foo” reorganised for casting later. See head(sl2) to see the form it takes.
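The two steps can be sketched as below. The contents of sl1, the choice of state and sector as identifier variables, and the use of sum as the aggregating function are all assumptions made for illustration; the code runs only if the reshape package is installed.

```r
if (requireNamespace("reshape", quietly = TRUE)) {
  library(reshape)

  # Hypothetical input data frame
  sl1 <- data.frame(
    state  = c("A", "A", "B", "B"),
    sector = c("rural", "urban", "rural", "urban"),
    foo    = c(10, 20, 30, 50)
  )

  # Step 1: melt "foo" into long form, keeping state and sector as ids
  sl2 <- melt(sl1, id.vars = c("state", "sector"), measure.vars = "foo")
  print(head(sl2))

  # Step 2: cast into a state-by-sector table, summing the values
  tab <- cast(sl2, state ~ sector, sum)
  print(tab)
}
```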

List of variables in a dataframe

The function “names” lists the columns/variables in a dataframe.
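For instance, on a small hypothetical data frame:

```r
# Hypothetical data frame
df <- data.frame(id = 1:2, age = c(30, 41), state = c("A", "B"))

# Character vector of the column names, in column order
vars <- names(df)
print(vars)
```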






Replacing selected values of variables

You can replace selected values of variables in a dataframe (or any other R object) by using the function replace. The documentation on the function is straightforward and comprehensive.
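A minimal sketch, assuming a hypothetical income column in which -99 is used as a missing-value code:

```r
# Hypothetical data: -99 stands for "not reported"
df <- data.frame(income = c(1200, -99, 800, -99))

# replace(x, which_ones, new_values) returns a modified copy of x
df$income <- replace(df$income, df$income == -99, NA)

print(df)
```

Because replace returns a copy rather than changing its argument, the result has to be assigned back to the column.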

More on “A Little Trick in Reading Data”

Here is an example of the beauty of R.

To split up a character string, all you need is a function called substr (for substring). So you don’t really need to write the variable into a new file and read it back with read.fwf as I did (see my earlier post titled […]
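As an illustration, suppose a hypothetical fixed-width identifier in which the first two characters are a state code and the remaining five a serial number. The layout is made up for this sketch.

```r
# Hypothetical fixed-width identifiers
record <- c("0112345", "0298765")

# substr(x, start, stop) works on the whole vector at once
state  <- substr(record, 1, 2)
serial <- substr(record, 3, 7)

data.frame(record, state, serial)
```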


The following text from the R help pages clearly explains how to use the subset command to extract data from a data.frame. This command is very useful when preparing a data set for further analysis.



subset {base} R Documentation

Subsetting Vectors and Data Frames


Return subsets of vectors or data frames […]
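A short sketch of row and column selection in a single call, on a hypothetical data frame called people:

```r
# Hypothetical data frame
people <- data.frame(
  name  = c("a", "b", "c"),
  age   = c(25, 40, 31),
  state = c("X", "Y", "X")
)

# Rows are chosen by a logical condition, columns by select
older_x <- subset(people, age > 30 & state == "X", select = c(name, age))

print(older_x)
```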