VBA Code - Sampling Tool

February 28, 2025

Why should I use Sampling Tool?

When working with large datasets in Excel, manually extracting samples can be time-consuming and error-prone. The Sampling Tool automates this process, allowing you to efficiently sample data based on user-defined parameters. This VBA script provides both manual and automatic sampling modes, making data extraction seamless and flexible.

Manual vs. Automatic Sampling

  • Manual Mode —Allows the user to specify the number of samples and the interval manually.
  • Automatic Mode —Automatically calculates the interval by dividing the total data size by the sample size.

This flexibility ensures that users can choose the best method based on their dataset and requirements.

Example

Suppose you have a dataset with 1000 rows and need 50 samples: Let’s say —

  • In manual mode, you can set an nth interval of 20, meaning every 20th row will be picked.
  • In automatic mode, the VBA script calculates the interval automatically as 1000 / 50 = 20.

So, what’s the differences—In Manual Mode you get a flexibility to change the interval of picking the samples whereas in Automatic Mode user only need to input the number of samples required and a random number that’s it, here the interval is automatically picked up by the code.

Impact of VBA Code and Functions

This VBA script enhances efficiency in data analysis by:

  • Reducing manual errors in sample selection.
  • Saving time in large dataset processing.
  • Allowing structured and randomized sampling within a given range.
  • Providing instant results with a dedicated output sheet named Sampled_Data.

VBA Code

Here’s the VBA code we are discussing:

Sub Sampling_tool()
    Dim rng As Range
    Dim ws As Worksheet, sampleSheet As Worksheet
    Dim numSamples As Long, interval As Long, startNum As Long
    Dim totalRows As Long
    Dim i As Long, sampleRow As Long
    Dim sampleCounter As Long
    Dim activeWb As Workbook
    Dim headerRow As Range
    Dim mode As String
    Dim modeChoice As VbMsgBoxResult

    ' Use the active workbook
    Set activeWb = ActiveWorkbook

    ' Select the range including headers
    On Error Resume Next
    Set rng = Application.InputBox("Select the range including headers:", Type:=8)
    On Error GoTo 0
    If rng Is Nothing Then Exit Sub ' Exit if no range is selected

    ' Define the header row
    Set headerRow = rng.Rows(1)

    ' Get the total number of rows excluding headers
    totalRows = rng.Rows.Count - 1 ' Subtract 1 for the header row

    ' Get mode selection from user via dialog box
    modeChoice = MsgBox("Choose Sampling Mode: Yes for Automatic, No for Manual", vbYesNoCancel + vbQuestion, "Sampling Mode Selection")

    ' Determine mode based on user selection
    If modeChoice = vbYes Then
        mode = "automatic"
    ElseIf modeChoice = vbNo Then
        mode = "manual"
    Else
        Exit Sub ' Exit if the user cancels
    End If

    ' Get user inputs based on mode
    If mode = "manual" Then
        numSamples = CLng(Application.InputBox("Enter the number of samples:", Type:=1))
        interval = CLng(Application.InputBox("Enter the sampling interval (e.g., every nth row):", Type:=1))
    Else ' Automatic mode
        numSamples = CLng(Application.InputBox("Enter the number of samples:", Type:=1))
        interval = totalRows \ numSamples ' Calculate interval automatically
    End If

    startNum = CLng(Application.InputBox("Enter the random starting row number (within the interval):", Type:=1))

    ' Validate inputs
    If numSamples <= 0 Or interval <= 0 Or startNum <= 0 Then
        MsgBox "Invalid inputs. Please enter positive numbers.", vbExclamation
        Exit Sub
    End If

    ' Ensure the first sample is within range
    If startNum > totalRows Then
        MsgBox "Starting row exceeds the data range!", vbExclamation
        Exit Sub
    End If

    ' Create a new sheet for samples in the current workbook
    Set ws = rng.Worksheet
    On Error Resume Next
    Set sampleSheet = activeWb.Sheets("Sampled_Data") ' Check if the sheet exists
    On Error GoTo 0

    ' If sheet exists, delete it to create a fresh one
    If Not sampleSheet Is Nothing Then
        Application.DisplayAlerts = False
        sampleSheet.Delete
        Application.DisplayAlerts = True
    End If

    ' Add a new sheet
    Set sampleSheet = activeWb.Sheets.Add
    sampleSheet.Name = "Sampled_Data"

    ' Add metadata to the new sheet
    sampleSheet.Cells(1, 1).Value = "Samples Selected — " & numSamples
    sampleSheet.Cells(2, 1).Value = "Population Size — " & totalRows
    sampleSheet.Cells(3, 1).Value = "Sample Interval — " & interval
    sampleSheet.Cells(4, 1).Value = "Initial Row — " & startNum

    ' Copy header to row 6 in the new sheet
    headerRow.Copy Destination:=sampleSheet.Range("B6")

    ' Initialize sample counter
    sampleCounter = 1 ' Start from row 7 (because row 6 is the header)
    i = startNum

    ' Loop to extract samples
    Do While sampleCounter <= numSamples And i <= totalRows
        sampleRow = rng.row + i ' Actual row number in the sheet

        ' Ensure sampleRow does not exceed worksheet limits
        If sampleRow > ws.Rows.Count Then Exit Do

        ws.Rows(sampleRow).Copy Destination:=sampleSheet.Cells(sampleCounter + 6, 1) ' Start at A7
        sampleCounter = sampleCounter + 1
        i = i + interval ' Move to the next sample row
    Loop

    ' Autofit columns in the new sheet
    sampleSheet.Columns.AutoFit

    MsgBox "Sampling complete! Data exported to 'Sampled_Data'", vbInformation
End Sub

How to Use This Code

To use this code in Excel:

  1. Open Excel and press ALT + F11 to open the Visual Basic for Applications (VBA) editor.
  2. In the VBA editor, insert a new module by right-clicking on any of the items in your project and selecting Insert > Module.
  3. Copy and paste the above code into the module window.
  4. Close the VBA editor and return to Excel.
  5. Run the macro:
    • Press ALT + F8, select Sampling_tool, and click Run.
    • Choose between Manual or Automatic mode.
    • Provide the necessary inputs.
  6. Check the newly created Sampled_Data sheet for your sampled dataset.

This tool simplifies the process of sampling large datasets and ensures structured data extraction without manual intervention.