
Hindi/Devanagari presentations using Org mode, R, LaTeX and Beamer

I recently had to prepare a Beamer presentation in Hindi/Devanagari. I usually prepare my Beamer presentations in Emacs Org mode, with a lot of R source code embedded in them. To adapt the entire setup to work with Devanagari, this is what I needed to do.

Make orgmode export to latex using xetex rather than [...]

ave

I discovered a new, very useful, R function yesterday: ave.

This is what it does: “Subsets of ‘x[]’ are averaged, where each subset consist of those observations with the same factor levels.”

Interestingly, ave is not limited to averages: any function can be supplied through its FUN argument, and the result of that function for each subset is set against every observation in that subset.
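A minimal sketch of both uses, on a made-up data frame (the names df, g and x are only for illustration):

```r
# Toy data: two groups, "a" and "b" (hypothetical values)
df <- data.frame(g = c("a", "a", "b", "b", "b"),
                 x = c(1, 3, 2, 4, 6))

# Default behaviour: the group mean is repeated against each observation
df$group_mean <- ave(df$x, df$g)

# Any other function can be passed via FUN; here, the group maximum
df$group_max <- ave(df$x, df$g, FUN = max)

print(df)
```

The result always has the same length as x, so it can be added directly as a new column.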

I wanted [...]

Heteroscedasticity

If a model is estimated using the following code: p <- lm(y ~ x1 + x2)

1. bptest(p) runs the Breusch-Pagan test to formally check for the presence of heteroscedasticity. To use bptest, you have to load the lmtest package.

2. If the test is positive (low p-value), you should see if any transformation of the dependent variable helps you eliminate [...]
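The first step can be sketched as follows. The data are simulated here purely for illustration, with an error spread that grows with x1, and the code assumes the lmtest package is installed (the test is skipped otherwise):

```r
set.seed(1)
n  <- 200
x1 <- runif(n, 1, 10)
x2 <- rnorm(n)
# Simulated response: the error standard deviation increases with x1
y  <- 1 + 2 * x1 + 0.5 * x2 + rnorm(n, sd = x1)

p <- lm(y ~ x1 + x2)

bp <- NULL
if (requireNamespace("lmtest", quietly = TRUE)) {
  bp <- lmtest::bptest(p)   # Breusch-Pagan test on the fitted model
  print(bp)                 # a low p-value suggests heteroscedasticity
}
```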

Find type of variables in a data frame

sapply(a, class) gives the class (character, numeric, factor, and so on) of each variable in the data frame a.
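For example, on a small made-up data frame (the column names are hypothetical):

```r
# A data frame with one column of each common type
a <- data.frame(name  = c("x", "y"),
                score = c(1.5, 2),
                grp   = factor(c("u", "v")),
                stringsAsFactors = FALSE)

# Named character vector: one class per column
types <- sapply(a, class)
print(types)
```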

Moving average/median

?rollmean (package=zoo)
?rollmedian (package=zoo)
?runmed (package=stats)
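A short sketch of these functions on a made-up series with two outliers. runmed ships with base R; the zoo calls are guarded because they assume the zoo package is installed:

```r
x <- c(1, 10, 2, 3, 50, 4, 5)

# Running median over a window of 3 observations (stats package)
med3 <- runmed(x, k = 3)
print(med3)

if (requireNamespace("zoo", quietly = TRUE)) {
  # Centred rolling mean and median over 3 observations;
  # without padding, the result is shorter than x by k - 1
  print(zoo::rollmean(x, k = 3))
  print(zoo::rollmedian(x, k = 3))
}
```

The running median is much less affected by the outliers 10 and 50 than the rolling mean is.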

“relevel” factors when using them to create dummy variables

?relevel: The levels of a factor are re-ordered so that the level specified by ‘ref’ is first and the others are moved down. This is useful for ‘contr.treatment’ contrasts which take the first level as the reference.
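A minimal sketch of the effect on dummy variables, using a made-up factor (the level names are only for illustration):

```r
f <- factor(c("low", "mid", "high", "mid"))
print(levels(f))    # default order is alphabetical, so "high" is the reference

# Make "low" the reference level
f2 <- relevel(f, ref = "low")
print(levels(f2))

# Dummy columns are now created for every level except "low"
dummies <- colnames(model.matrix(~ f2))
print(dummies)
```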

Working with ggplot, version 2

In my previous post on this issue, I presented code that made weighted boxplots and annotated them with boxplot statistics and the mean values. The problem with that code was that it printed these annotations right on the vertical axes of the boxplots. Another, relatively minor, problem was that, when the values [...]

Reading large tables into R

Here are some useful tips on the issue.

V.

Reading data from Microsoft Access files in Linux

Some of the Census 2001 data are in Microsoft Access files (with the filename extension .mdb). A Microsoft Access file can contain several tables, each of which holds data. A program called mdbtools can be used to read Access files.

The command mdb-tables can be used to see the names of tables [...]
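One possible way to call mdbtools from within R is sketched below. The helper names list_mdb_tables and read_mdb_table are hypothetical, and the use of mdb-export (another mdbtools command, which writes a table as CSV) is an assumption that goes beyond what is described above:

```r
# List the tables in an Access file, one name per element.
# Returns character(0) if mdbtools or the file is not available.
list_mdb_tables <- function(mdb_file) {
  if (!nzchar(Sys.which("mdb-tables")) || !file.exists(mdb_file)) {
    return(character(0))
  }
  # "-1" asks mdb-tables to print one table name per line
  system2("mdb-tables", c("-1", shQuote(mdb_file)), stdout = TRUE)
}

# Read one table into a data frame via mdb-export (assumed to be installed)
read_mdb_table <- function(mdb_file, table) {
  cmd <- paste("mdb-export", shQuote(mdb_file), shQuote(table))
  read.csv(pipe(cmd), stringsAsFactors = FALSE)
}
```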

simple.scatterplot: Two way distributions

John Verzani’s book has a title page that shows a scatterplot with histograms of x and y variables along the two axes. It is a very powerful way of looking at two distributions. The plot was generated through a function simple.scatterplot. The function is made available as part of the UsingR package, which can be [...]

Graphics (base, grid and lattice) in R News

R News 2(2) has papers on the grid and lattice packages. R News 3(2) has papers on base, grid and gridBase.

Essential stuff for anybody trying to master R graphics.

V.

Working with ggplot

Hadley Wickham’s ggplot is a very interesting package. It makes beautiful graphics and integrates well with some of the other packages, allowing you to superimpose plots of various types of estimates on plots of data. In particular, it uses colours very well. The default colour schemes are aesthetically pleasing. It allows a flexible use [...]

Dropping columns in subset command

Use “select=c(var1,var2)” in the subset command to select var1 and var2.

Use “select=-c(var1,var2)” in the subset command to drop var1 and var2.
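A small illustration on a made-up data frame with the columns var1, var2 and var3:

```r
df <- data.frame(var1 = 1:3,
                 var2 = c("a", "b", "c"),
                 var3 = c(TRUE, FALSE, TRUE))

# Keep only var1 and var2
kept <- subset(df, select = c(var1, var2))

# Drop var1 and var2, keeping everything else
dropped <- subset(df, select = -c(var1, var2))

print(names(kept))
print(names(dropped))
```

The result stays a data frame even when only one column is left.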


Page orientation problem in converting postscript files to pdf using ps2pdf

A commonly reported problem with ps2pdf is that it does not always guess the page orientation right.

A neat solution is described at http://allendowney.com/essays/orientation/

I edited gs_statd.ps to define the wide page and added an alias called widepdf to my bashrc to convert files to pdf in the wide format. It works great now!

[...]

Statistics on Microsoft Excel

Here is an interesting document on the problems of using Microsoft Excel for statistical analysis.

http://gcrc.ucsd.edu/biostatistics/Excel.pdf
