Sas and sql

Solo disponible en BuenasTareas
  • Páginas : 12 (2841 palabras )
  • Descarga(s) : 0
  • Publicado : 27 de noviembre de 2011
Leer documento completo
Vista previa del texto
SUGI 29

Tutorials

Paper 268-29

Introduction to Proc SQL
Katie Minten Ronk, Systems Seminar Consultants, Madison, WI
ABSTRACT
PROC SQL is a powerful Base SAS Procedure that combines the functionality of DATA and PROC steps into a single step. PROC SQL can sort, summarize, subset, join (merge), and concatenate datasets, create new variables, and print the results or create a new tableor view all in one step! PROC SQL can be used to retrieve, update, and report on information from SAS data sets or other database products. This paper will concentrate on SQL’s syntax and how to access information from existing SAS data sets. Some of the topics covered in this brief introduction include: • • • • • Writing SQL code using various styles of the SELECT statement. Dynamically creatingnew variables on the SELECT statement. Using CASE/WHEN clauses for conditionally processing the data. Joining data from two or more data sets (like a MERGE!). Concatenating query results together.

WHY LEARN PROC SQL?
PROC SQL can not only retrieve information without having to learn SAS syntax, but it can often do this with fewer and shorter statements than traditional SAS code. Additionally,SQL often uses fewer resources than conventional DATA and PROC steps. Further, the knowledge learned is transferable to other SQL packages.

AN EXAMPLE OF PROC SQL SYNTAX
Every PROC SQL query must have at least one SELECT statement. The purpose of the SELECT statement is to name the columns that will appear on the report and the order in which they will appear (similar to a VAR statement onPROC PRINT). The FROM clause names the data set from which the information will be extracted from (similar to the SET statement). One advantage of SQL is that new variables can be dynamically created on the SELECT statement, which is a feature we do not normally associate with a SAS Procedure: PROC SQL; SELECT STATE, SALES, (SALES * .05) AS TAX FROM USSALES; QUIT; (no output shown for this code)THE SELECT STATEMENT SYNTAX
The purpose of the SELECT statement is to describe how the report will look. It consists of the SELECT clause and several sub-clauses. The sub-clauses name the input dataset, select rows meeting certain conditions (subsetting), group (or aggregate) the data, and order (or sort) the data: PROC SQL options; SELECT column(s) FROM table-name | view-name WHERE expressionGROUP BY column(s) HAVING expression ORDER BY column(s); QUIT;

A SIMPLE PROC SQL
An asterisk on the SELECT statement will select all columns from the data set. By default a row will wrap when there is too much information to fit across the page. Column headings will be separated from the data with a line and no observation number will appear: PROC SQL;

1

SUGI 29

Tutorials

SELECT *FROM USSALES; QUIT; (see output #1 for results)

A COMPLEX PROC SQL
The SELECT statement in it’s simplest form, needs a SELECT and a FROM clause. The SELECT statement can also have all six possible clauses represented in a query: proc sql; SELECT state, sum(sales) as TOTSALES FROM ussales WHERE state in (’WI’,’MI’,’IL’) GROUP BY state HAVING sum(sales) > 40000 ORDER BY state desc; quit; (seeoutput #2 for results) These statements will be reviewed in detail later in the paper.

LIMITING INFORMATION ON THE SELECT
To specify that only certain variables should appear on the report, the variables are listed and separated on the SELECT statement. The SELECT statement does NOT limit the number of variables read. The NUMBER option will print a column on the report labeled 'ROW' which containsthe observation number: PROC SQL NUMBER; SELECT STATE, SALES FROM USSALES; QUIT; (see output #3 for results)

CREATING NEW VARIABLES
Variables can be dynamically created in PROC SQL. Dynamically created variables can be given a variable name, label, or neither. If a dynamically created variable is not given a name or a label, it will appear on the report as a column with no column heading....
tracking img