Forums

The forums ran from 2008-2020 and are now closed and viewable here as an archive.

Home Forums Back End How do I represent this data?!

  • This topic is empty.
Viewing 10 posts - 1 through 10 (of 10 total)
  • Author
    Posts
  • #151500
    FahimDaDream
    Participant

    Hello all,

    I’m attempting to make a small php webapp (if you can call it that) for a non-profit for free, but I’m a tad bit inexperienced.

    The webapp is essentially laid out as followed:

    The user is presented with a form to fill out with X amount of questions on it. Depending on their answers, they will be presented with another set of questions. Depending on those answers, they will be asked another set, and so on and so forth.

    Similar to a large tree with any number of branches.

    The trouble I’m having is what would be a the best way to store this data or load it from a DB so that I can make it flexible incase more questions and more cases are added later.

    Can I represent a tree data structure in PHP? If so, would I store the questions in a data base or some sort of other XML data structure?

    Any help would be appreciated!!

    #151507
    __
    Participant

    Can I represent a tree data structure in PHP?

    of course.

    $data = array(
        'tree' => array(
            'branch1' => array( 'some','data' )
           ,'branch2' => array( 'more','data' )
        )
       ,'etc.' => array( 'and','so','on' )
    );
    

    If so, would I store the questions in a data base or some sort of other XML data structure?

    That’s a bit more involved. How are you going to use these questions? Will you ever need to retrieve individual questions? Are questions expected to change content and/or positions? be added or removed often (or temporarily)? Are you going to have lots of questions or branches? Is there metadata (e.g., answers or conditions about when to use) associated with individual questions?

    If you answer “no” to all of those, then xml could work just fine. I would actually recommend JSON instead, however – it is much easier and quicker to author, manipulate, and store.

    Otherwise, I would suggest a database. You could try an object-oriented database (e.g., Mongo) if you want to preserve the tree structure. If you wanted to use a relational database (e.g., MySQL), you have to do a bit more work to represent that tree structure.

    #151518
    FahimDaDream
    Participant

    Thank you for replying!

    How are you going to use these questions?

    They will be displayed much like form elements. The user will type in the question.

    Will you ever need to retrieve individual questions?

    Most likely not, but it is a possibility

    Are questions expected to change content and/or positions? be added or removed often (or temporarily)? Not often,

    only during development or during tweaking

    Are you going to have lots of questions or branches? Yes, this is highly probable down the road. Is 100-200 questions too many? Ideally the end user wont be answering that many, but there may be a lot of branches so it could add up

    Is there metadata (e.g., answers or conditions about when to use) associated with individual questions? I think that’s a yes. Basically if the user is asked, for example, 1. What’s your fav kind of car? (A,B,C,D) 2. Do you like to drive them fast? (Y/N).

    Then there may be a set of follow up questions based on whether the user chose, A, B, C, D per mutated with Y/N. And the appropriate question will be displayed on this.

    Storing it in a DB is possible, but I can’t seem to wrap my head around what would be the best way to have it set up so I can retrieve them. Perhaps I’ll have to come up with my own mapping structure to map the questions.

    #151542
    FahimDaDream
    Participant

    Maybe I should elaborate my situation. I’ll provide a more thorough example and essentially I need help figuring out the best structure to represent in a relational database like MySQL.

    The user will first be asked a set of questions with the following possible answers : Q1: How old are you? a)0-18 b)18-25 c)25-55 d)55+

    Q2: Do you smoke? a) Yes b) No

    If the user selects option A for the first one and option A for the 2nd one the user may be asked a set of questions based on that response. If they chose any other combination they may get another SET of questions.

    The issues I’m facing are,
    How do I represent a SET of questions and their answers?

    How do I represent the fact that for a given set, choosing option A for Q1 and option B for Q2 leads to set 3, whereas choosing option B for Q1 and option C for Q2 lead to set 4

    I will never have to go backwards, but I may be able to add more questions to a set that could lead to other sets.

    #151544
    __
    Participant

    EDIT

    I wrote this before I read your post above. I think it still applies to your questions, though.

    , – – – – – – – – – – – – – – – – – – – –

    You’ll probably need to do a lot of figuring about how the followup questions are determined. It might get fairly complex if, say, question X depends on the answers of both questions A and B, etc..

    Maybe start out with something like this (MySQL):

    CREATE TABLE question(
        id SERIAL PRIMARY KEY  -- primary key
       ,q VARCHAR(255)         -- text of question
    )ENGINE=InnoDB DEFAULT CHARSET=UTF8;
    
    CREATE TABLE question_opt(
        id SERIAL PRIMARY KEY          -- primary key
       ,q_id BIGINT UNSIGNED NOT NULL  -- foreign key for question
       ,o VARCHAR(255)                 -- text of option
       ,FOREIGN KEY( q_id ) REFERENCES question( id )
    )ENGINE=InnoDB DEFAULT CHARSET=UTF8;
    
    CREATE TABLE question_map(
        o_id BIGINT UNSIGNED NOT NULL  -- foreign key for option
       ,q_id BIGINT UNSIGNED NOT NULL  -- foreign key for question
       ,PRIMARY KEY( o_id,q_id )       -- primary key
       ,FOREIGN KEY( o_id ) REFERENCES question_opt( id )
       ,FOREIGN KEY( q_id ) REFERENCES question( id )
       ,INDEX( q_id )
    )ENGINE=InnoDB DEFAULT CHARSET=UTF8;
    

    Basically, we store questions in one table, options in another, and then we have a third table that ties certain questions to the answers of other questions. As a simple example:

    question
    +---+-----------------------------+
    |id | q                           |
    +===+=============================+
    | 1 | What is your favorite color |
    +---+-----------------------------+
    | 2 | Who is your favorite Beatle |
    +---+-----------------------------+
    | 3 | Do you also like green      |
    +---+-----------------------------+
    | 4 | Is it Yoko's fault          |
    +---+-----------------------------+
    
    option
    +---+------+--------+
    |id | q_id | o      |
    +===+======+========+
    | 1 | 1    | red    |
    +---+------+--------+
    | 2 | 1    | blue   |
    +---+------+--------+
    | 3 | 1    | yellow |
    +---+------+--------+
    | 4 | 2    | John   |
    +---+------+--------+
    | 5 | 2    | Paul   |
    +---+------+--------+
    | 6 | 2    | George |
    +---+------+--------+
    | 7 | 2    | Ringo  |
    +---+------+--------+
    | 8 | etc ...
    
    question_map
    +------+------+
    | o_id | q_id |
    +======+======+
    | 2    | 3    |
    +------+------+
    | 4    | 4    |
    +------+------+
    

    So, to start, you could select all questions that are not associated with any particular answer:

    SELECT id,q 
      FROM question
      WHERE id NOT IN(
        SELECT q_id FROM question_map 
      )
    ;
    

    You’d get:

    1 What is your favorite color

    2 Who is your favorite Beatle

    Next, select the options for those questions (using the id from each row in the previous result):

    SELECT id,o 
      FROM question_opt
      WHERE q_id IN(
        1,2
      )
    ;
    

    You’d get options 1 through 7, and you could assign them to the questions using the question’s id.

    Say the user chooses option 1 (“red”) for the first question, and option 4 (“John”) for the second. You’d use the option ids to look for more questions:

    SELECT q.id,q.q
      FROM question q
      JOIN question_map m
      ON q.id = m.q_id
      WHERE m.q_id IN(
        1,4
      )
    ;
    

    Because there is a question associated with option 4, you’d then ask the user:

    4 Was it Yoko’s fault

    #151545
    FahimDaDream
    Participant

    For the database layout I did have a similar layout. But yes, the difficulty will come from the fact that depending on the answer to Question A and Question B is where I choose my next SET of questions.

    I was thinking I store the questions, their options, and the set they belong to in a database. Then I use either JSON or some other data type to quickly reference which is the next set of questions I should pull?

    So I’d have something like.

    A table to store questions (like you have)
    A table to store options (like you have)
    A table to map questions to which set they belong to.

    And Some sort of data type to say
    if SetID = 1 and the order of answers are, “ringo, blue, yes, no” go to set 2
    if SetID = 1 and the order of answers are, “John, green, no, no” go to set 3
    if SetID = 1 defaults go to set 5

    But this would be really rigid because it would require knowing the possible answers before hand as well as the number of questions in the set. But I can think of no other way of doing it.

    #151557
    __
    Participant

    Can questions ever belong in more than one set?

    If not, you can add a set column in the question table – no need for another intersection table.

    if SetID = 1 and the order of answers are, “ringo, blue, yes, no” go to set 2 if SetID = 1 and the order of answers are, “John, green, no, no” go to set 3 if SetID = 1 defaults go to set 5

    hmm… it’s not exactly normalized, but you could use the answers as the pattern:

    question_map
    +---------+--------------+
    | answers | question_set |
    +=========+==============+
    | 6,2,T,F | 2            |
    +---------+--------------+
    | 4,1,F,F | 3            |
    +---------+--------------+
    

    The query would be a little complex, though, since you would have to either allow for missing characters (while still selecting the “best” match) or explicitly provide a default option in your query:

    SELECT id,q
      FROM question
      WHERE set IN(
        COALESCE( 
          (SELECT question_set 
            FROM question_map
            WHERE answers = '6,2,T,F')
         ,5  -- this is the "default" question set
        )
      )
    ;
    
    #151558
    FahimDaDream
    Participant

    Hmm,

    Sounds good, looks like that’s the only way I can really think of doing it. I’ll keep searching, I can’t be the only one wanting to do this so there must already be an existing solution out there.

    Thanks again Traq! You’ve been an awesome help. If there were a way for +1’s, thumbs up or something like that, you’d definitely get a couple of them :)

    #151559
    __
    Participant

    no prob, glad to help

    #151560
    FahimDaDream
    Participant

    Before I try this solution, I’m going to check out this particular open source application to use: http://www.limesurvey.org/en/

    This seems to do exactly what I want. If it works out, I’ll use this, otherwise I’ll build it myself.

Viewing 10 posts - 1 through 10 (of 10 total)
  • The forum ‘Back End’ is closed to new topics and replies.