
Data for Stock Market Prediction

wessel Member Posts: 537 Maven
Dear All,

Can someone suggest some data to use for stock market prediction?
You can download daily volume and opening/closing prices for just about any stock from Google Finance and Yahoo Finance.

Furthermore, with a little more effort you can download news headlines.

What more?
What data is typically used by professionals?

Best regards,



    ronmac Member Posts: 11 Contributor II
    I like to start my data analysis by collecting various financial data series and running the RM Mutual Information operator on them. That helps get me in the right "stadium" with the data sets to use. The Correlation operator is also useful. Then I work up to getting on the "playing field" using other RM operators. Lately I have been using the k-Means clustering operator to help find some useful predictive patterns. Yahoo Pipes has some great tools for data collection. Here is an Excel macro I use to pull Yahoo data directly into Excel; I then read the workbook into RM. This makes it fairly easy and fast to find useful relationships in your data. Sorry, there is no shortcut, and I don't think anybody who believes they have discovered the "Holy Grail" is going to come forward with the secret recipe.
    A very good video I just watched covers this: “Markets are Efficient if and only if P = NP”.

    Ron McEwan
    Sub GetData()
        'Pull historical quotes for one symbol into the active sheet.
        'Inputs: B2 = start date, B3 = end date, B4 = ticker symbol, C3 = value sent as the "g" parameter.
        Dim DataSheet As Worksheet
        Dim EndDate As Date
        Dim StartDate As Date
        Dim Symbol As String
        Dim qurl As String

        'suspend screen redraws, alerts and recalculation while the data loads
        Application.ScreenUpdating = False
        Application.DisplayAlerts = False
        Application.Calculation = xlCalculationManual

        Set DataSheet = ActiveSheet
        StartDate = DataSheet.Range("B2").Value
        EndDate = DataSheet.Range("B3").Value
        Symbol = DataSheet.Range("B4").Value

        'construct the URL for the query (the base URL string is empty in the post as published)
        'months are sent zero-based, hence Month(...) - 1
        qurl = "" & Symbol
        qurl = qurl & "&a=" & Month(StartDate) - 1 & "&b=" & Day(StartDate) & _
            "&c=" & Year(StartDate) & "&d=" & Month(EndDate) - 1 & "&e=" & _
            Day(EndDate) & "&f=" & Year(EndDate) & "&g=" & DataSheet.Range("C3").Value & _
            "&q=q&y=0&z=" & Symbol & "&x=.csv"
        DataSheet.Range("C5").Value = qurl

        'download the CSV into a query table starting at C7
        With DataSheet.QueryTables.Add(Connection:="URL;" & qurl, Destination:=DataSheet.Range("C7"))
            .BackgroundQuery = True
            .TablesOnlyFromHTML = False
            .Refresh BackgroundQuery:=False
            .SaveData = True
        End With

        'split the comma-separated text into columns, then format dates, prices and volume
        DataSheet.Range("C7").CurrentRegion.TextToColumns Destination:=DataSheet.Range("C7"), _
            DataType:=xlDelimited, TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, _
            Tab:=True, Semicolon:=False, Comma:=True, Space:=False, Other:=False
        With DataSheet
            .Range(.Range("C7"), .Range("C7").End(xlDown)).NumberFormat = "mmm d/yy"
            .Range(.Range("D7"), .Range("G7").End(xlDown)).NumberFormat = "0.00"
            .Range(.Range("H7"), .Range("H7").End(xlDown)).NumberFormat = "0,000"
        End With

        'sort the downloaded block by date, oldest first (explicit range instead of the current selection)
        DataSheet.Range("C7").CurrentRegion.Sort Key1:=DataSheet.Range("C8"), Order1:=xlAscending, _
            Header:=xlGuess, OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom

        'turn calculation, alerts and screen updating back on
        Application.Calculation = xlCalculationAutomatic
        Application.DisplayAlerts = True
        Application.ScreenUpdating = True
    End Sub
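
For readers who want to try the screening step outside RapidMiner, the mutual-information idea from the post above can be sketched in plain Python. This is only a rough illustration, not the RM Mutual Information operator: the equal-frequency binning, the default of three bins, and the function names `discretize`, `mutual_information` and `rank_candidates` are all assumptions made for this example.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Map each value to a bin index using equal-frequency (rank-based) bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * bins // len(values)
    return labels


def mutual_information(x, y, bins=3):
    """Estimate the mutual information (in bits) between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    n = len(x)
    bx, by = discretize(x, bins), discretize(y, bins)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with raw counts
        mi += p_ab * math.log2(count * n / (px[a] * py[b]))
    return mi


def rank_candidates(target, candidates, lag=1, bins=3):
    """Rank candidate series by MI between candidate[t] and target[t + lag]."""
    future = target[lag:]
    scores = {
        name: mutual_information(series[:len(series) - lag], future, bins)
        for name, series in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_candidates(target, {"name": series, ...})` returns the candidate series ordered from most to least informative about the next value of `target`. With short series and few bins the estimate is noisy, so it is probably best treated as a first filter, in the "right stadium" sense described above.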
    wessel Member Posts: 537 Maven

    I'm not looking for some holy grail.

    When searching ScienceDirect there are lots of papers that analyze stock market data,
    using many more variables than just past prices, like:
    Dividends for the index, bid-ask spread of T-bills, T-bill holding period, nominal stock returns of the index, excess returns of the index, various duration treasury rates, various duration CD rates, corporate bond yields, the producer price index, the consumer price index, the industrial production index, M1 money supply, various treasury term spreads, and various default spreads between BAA bonds and various treasuries.

    I want to do similar research, but I have no idea where to download such data.
    I looked at Yahoo Pipes; although it is cool, I don't think it helps with this problem.

    Best regards,
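
Several of the variables listed in the post above (excess returns, term spreads, default spreads) are simple differences between series that are published directly. A minimal Python sketch, assuming all inputs are already aligned to the same dates and expressed in the same units; the function and argument names are made up for this illustration, and the choice of which Treasury maturity to subtract is an assumption that varies from paper to paper.

```python
def difference_series(first, second):
    """Return first[t] - second[t] for two equal-length series."""
    if len(first) != len(second):
        raise ValueError("series must have the same length")
    return [a - b for a, b in zip(first, second)]


def derived_variables(index_return, tbill_return, long_treasury_yield,
                      tbill_yield, baa_yield):
    """Build the difference-based predictors from aligned input series."""
    return {
        # index return over and above the T-bill return
        "excess_return": difference_series(index_return, tbill_return),
        # long-dated Treasury yield minus T-bill yield
        "term_spread": difference_series(long_treasury_yield, tbill_yield),
        # BAA corporate bond yield minus long-dated Treasury yield
        "default_spread": difference_series(baa_yield, long_treasury_yield),
    }
```

For example, a BAA yield of 6.5 against a Treasury yield of 4.25 (illustrative numbers, not real data) gives a default spread of 2.25 percentage points.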

    ronmac Member Posts: 11 Contributor II
    Some of the data you are looking for can be found at the St. Louis Federal Reserve site. They have a free Excel tool to simplify data retrieval.
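
As an alternative to the Excel tool, the St. Louis Fed data (FRED) can also be read as CSV. The sketch below rests on several assumptions: that the public download endpoint is `https://fred.stlouisfed.org/graph/fredgraph.csv?id=<series>`, that each file has a header row followed by `date,value` rows, and that missing observations appear as `.` or as an empty field. The series ID used in the comment (`BAA`) is likewise an assumption to check on the site.

```python
import csv
import io
from datetime import date
from urllib.parse import urlencode

# Assumed public CSV endpoint; verify on the FRED site before relying on it.
FRED_CSV_ENDPOINT = "https://fred.stlouisfed.org/graph/fredgraph.csv"


def fred_csv_url(series_id):
    """Build the download URL for one series, e.g. fred_csv_url("BAA")."""
    return FRED_CSV_ENDPOINT + "?" + urlencode({"id": series_id})


def parse_fred_csv(text):
    """Parse 'date,value' CSV text into a {date: float} dictionary.

    The header row is skipped, and rows whose value is missing
    ('.' or empty) are dropped.
    """
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row, if any
    observations = {}
    for row in reader:
        if len(row) < 2 or row[1].strip() in ("", "."):
            continue
        observations[date.fromisoformat(row[0])] = float(row[1])
    return observations
```

The text passed to `parse_fred_csv` could come from `urllib.request.urlopen(fred_csv_url(...))` or from a file saved by hand; the resulting dictionaries can then be written to a workbook or CSV file and read into RM.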