September 2010

ave

I discovered a new, very useful, R function yesterday: ave.

This is what it does: “Subsets of ‘x[]’ are averaged, where each subset consist of those observations with the same factor levels.”

Interestingly, you are not limited to the average: any other function can be supplied through the FUN argument. The output of that function for each subset is set against every observation in that subset.

I wanted to, for example, stick [...]
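A minimal sketch of how ave behaves, using a small made-up data frame (the names d, group and score are only for illustration and are not from the original post):

```r
# Hypothetical data: six observations in two groups
d <- data.frame(group = c("a", "a", "a", "b", "b", "b"),
                score = c(1, 2, 3, 10, 20, 60))

# Default behaviour: the group mean is repeated against each observation
d$group_mean <- ave(d$score, d$group)

# Any other function can be passed through FUN, here the group maximum
d$group_max <- ave(d$score, d$group, FUN = max)

print(d)
```

The result has the same length as the input vector, so it can be added directly as a new column.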

Basic skills to use a spreadsheet software

Two social science students, who had so far used computers only for limited, specific tasks, recently asked me for a course outline they could follow to pick up basic spreadsheet skills. I prepared the list and thought it might interest others as well. Hence this post.

1. Basic structure [...]

Heteroscedasticity

If a model is estimated using the following code:
lm(y~x1+x2)->p

1. bptest(p) runs the Breusch-Pagan test, a formal check for the presence of heteroscedasticity. To use bptest, you first have to load the lmtest package.

2. If the test rejects the null of constant variance (low p-value), you should see whether a transformation of the dependent variable helps you eliminate the heteroscedasticity. Also check if [...]
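The first step can be sketched as follows. The data here are simulated purely for illustration (the error spread is made to grow with x1), and the lmtest package is assumed to be installed:

```r
library(lmtest)  # provides bptest()

# Simulated data (an assumption for this sketch): error variance rises with x1
set.seed(1)
x1 <- runif(200)
x2 <- runif(200)
y  <- 1 + 2 * x1 + 3 * x2 + rnorm(200, sd = 0.5 + 2 * x1)

lm(y ~ x1 + x2) -> p   # same model form as in the post

bt <- bptest(p)        # Breusch-Pagan test on the fitted model
print(bt)

# A low p-value is evidence against constant error variance
bt$p.value
```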

Find type of variables in a data frame

sapply(a,class) gives the type of field (character, numeric, or factor) for each variable in the data [...]
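A small illustration, with a made-up data frame standing in for a (the column names are hypothetical):

```r
# Hypothetical data frame with one column of each type mentioned above
a <- data.frame(name  = c("x", "y"),
                value = c(1.5, 2),
                grp   = factor(c("u", "v")),
                stringsAsFactors = FALSE)

# Returns a named character vector: one class per column
types <- sapply(a, class)
print(types)
```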

Moving average/median

?rollmean (package=zoo)
?rollmedian [...]
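A short sketch of both functions on a made-up vector, assuming the zoo package is installed. Note that rollmedian needs an odd window width k:

```r
library(zoo)  # provides rollmean() and rollmedian()

x <- c(1, 5, 2, 8, 3)  # hypothetical series

# Moving average over windows of 3: (1,5,2), (5,2,8), (2,8,3)
ma <- rollmean(x, k = 3)

# Moving median over the same windows
med <- rollmedian(x, k = 3)

# fill = NA pads the ends so the result has the same length as x
ma_padded <- rollmean(x, k = 3, fill = NA)

print(ma)
print(med)
print(ma_padded)
```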

“relevel” factors when using them to create dummy variables

?relevel

The levels of a factor are re-ordered so that the level specified by ‘ref’ is first and the others are moved down. This is useful for ‘contr.treatment’ contrasts which take the first level [...]
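To see the effect on dummy variables, here is a minimal sketch with a made-up factor (the level names are only for illustration):

```r
# Hypothetical factor; levels are sorted alphabetically by default
f <- factor(c("low", "mid", "high", "mid"))
levels(f)

# Make "low" the first level, which treatment contrasts use as the baseline
f2 <- relevel(f, ref = "low")
levels(f2)

# Dummy columns are created for every level except the first one
mm <- model.matrix(~ f2)
colnames(mm)
```

After releveling, the omitted (baseline) category in any model using f2 is "low" rather than "high".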

Working with ggplot, version 2

In my previous post on this issue, I presented code that made weighted boxplots and annotated them with the boxplot statistics and the mean values. The problem with that code was that it printed these annotations right on the vertical axes of the boxplots. Also, a relatively minor problem was that, when the values of [...]

Reading large tables into R

Here are some useful tips on [...]

Reading large datasets in R

Farnsworth has discussed, with an example, a faster way of reading large files. It would be nice if some of you tried implementing it to read schdata.txt.

Also, let us collectively mine the documentation/r-help for more resources [...]
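As one starting point, here is a sketch of a commonly suggested trick from the read.table documentation: declare the column classes and the number of rows up front. This is not necessarily the approach Farnsworth describes, and the temporary file below is only a made-up stand-in for a real file such as schdata.txt.

```r
# Hypothetical stand-in for a large tab-separated file
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:1000,
                       x  = seq_len(1000) / 10,
                       g  = rep(c("a", "b"), 500)),
            path, sep = "\t", row.names = FALSE, quote = FALSE)

# Step 1: read only a few rows to learn the column classes
head_rows <- read.table(path, header = TRUE, sep = "\t",
                        nrows = 5, stringsAsFactors = FALSE)
classes <- sapply(head_rows, class)

# Step 2: read the full file with the classes declared, so R does not
# have to guess the type of each column while reading
big <- read.table(path, header = TRUE, sep = "\t",
                  colClasses = classes, nrows = 1000, comment.char = "")

str(big)
```

Guessing classes from the first few rows can go wrong if later rows hold different kinds of values, so it is worth checking the result on real data.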

Reading data from Microsoft Access files in Linux

Some of the Census 2001 data are in Microsoft Access files (with the filename extension .mdb). An Access file can hold several tables, each of which contains data. A set of tools called mdbtools can be used to read Access files.

The command mdb-tables can be used to see the names of tables and the [...]
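If the goal is to get the tables into R, the mdbtools commands can be called from R itself. The following is only a sketch: the helper names list_mdb_tables and read_mdb_table are made up for this example, it assumes mdbtools is installed and on the PATH, and it relies on mdb-export writing a table out as CSV.

```r
# Hypothetical helper: list table names in an Access file, one per line
list_mdb_tables <- function(path) {
  system2("mdb-tables", c("-1", shQuote(path)), stdout = TRUE)
}

# Hypothetical helper: export one table as CSV and parse it into a data frame
read_mdb_table <- function(path, table) {
  csv <- system2("mdb-export", c(shQuote(path), shQuote(table)),
                 stdout = TRUE)
  read.csv(text = csv, stringsAsFactors = FALSE)
}

# Usage, with a made-up file name:
# tabs <- list_mdb_tables("census.mdb")
# d    <- read_mdb_table("census.mdb", tabs[1])
```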