KiXforms Forum Index KiXforms
The Forum for the KiXforms Community
 
IMDb parser (just a fragment so far)

 
masken

Posted: Mon Jun 02, 2003 10:32 pm    Post subject: IMDb parser (just a fragment so far)

...something I'm working on just for the fun of it :) It takes an IMDb ID, fetches the matching title page, and tries to parse the results into text boxes, combo boxes, etc.

Code:


BREAK ON

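$cr = CHR(13) + CHR(10)

; Base URL for title lookups; GetIMDb() appends the IMDb ID to it.
; (Taken from the example URL in the comments further down.)
$IMDbSite = "http://akas.imdb.com/Details?"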

;---------------------------------------
$formMain = CREATEOBJECT("Kixtart.Form")

$formMain.Caption = "IMDb title collector v.0.1"
$formMain.Width = 600
$formMain.Height = 400
$formMain.Center
$formMain.BackColor = $formMain.RGB(255,255,255)
$formMain.FontSize = 8
$formMain.FontName = "Arial"

;---------------------------------------
; Label the IMDb ID textbox

$lblIMDbID = $formMain.Label()
$lblIMDbID.Caption   = "Enter IMDb #:"
$lblIMDbID.Left      = 5
$lblIMDbID.Top      = 10
$lblIMDbID.Width   = 65

;---------------------------------------
; Textbox for the IMDb ID
$txtboxIMDbID = $formMain.TextBox()
$txtboxIMDbID.Left       = $lblIMDbID.Right
$txtboxIMDbID.Top        = 6
$txtboxIMDbID.Width      = 70
$txtboxIMDbID.Height      = 20
$txtboxIMDbID.FontBold      = 1
$txtboxIMDbID.ForeColor      = $formMain.RGB(255,0,0)
$txtboxIMDbID.Value      = "0235326"

;---------------------------------------
; Button to grab IMDb information
$btnGetIMDb = $formMain.CommandButton()
$btnGetIMDb.Caption   = "IMDb Query"
$btnGetIMDb.Left   = $txtboxIMDbID.Right + 5
$btnGetIMDb.Top      = $txtboxIMDbID.Top
$btnGetIMDb.Width   = 120
$btnGetIMDb.Height   = $txtboxIMDbID.Height
$btnGetIMDb.Enabled   = 1
$btnGetIMDb.OnClick   = "GetIMDb()"

;---------------------------------------
; TextBoxes for IMDb information

;--- TITLE
$lblTitle = $formMain.Label()
$lblTitle.Caption   = "Title:"
$lblTitle.Left      = $lblIMDbID.Left
$lblTitle.Top      = $lblIMDbID.Bottom + 10
$lblTitle.Width      = $lblIMDbID.Width
$lblTitle.Height   = 20

$txtboxTitle = $formMain.TextBox()
$txtboxTitle.Left       = $txtboxIMDbID.Left
$txtboxTitle.Top        = $lblTitle.Top - 4
$txtboxTitle.Right      = $btnGetIMDb.Right
$txtboxTitle.Height      = 22 ; min height for a combobox with fontsize 8 seems to be 22
$txtboxTitle.FontBold      = 1
$txtboxTitle.FontSize      = 10

;--- YEAR
$lblYear = $formMain.Label()
$lblYear.Caption   = "Year:"
$lblYear.Left      = $txtboxTitle.Right + 5
$lblYear.Top      = $lblTitle.Top
$lblYear.Width      = 35
$lblYear.Height      = $lblTitle.Height

$txtboxYear = $formMain.TextBox()
$txtboxYear.Left    = $lblYear.Right
$txtboxYear.Top     = $txtboxTitle.Top
$txtboxYear.Width   = 42
$txtboxYear.Height   = $txtboxTitle.Height
$txtboxYear.FontBold   = 1
$txtboxYear.FontSize   = 10

;--- TITLE TAGS
$lblTags = $formMain.Label()
$lblTags.Caption   = "Title tags:"
$lblTags.Left      = $txtboxYear.Right + 5
$lblTags.Top      = $lblTitle.Top
$lblTags.Width      = 50
$lblTags.Height      = $lblTitle.Height

$cboboxTags = $formMain.ComboBox()
$cboboxTags.Left    = $lblTags.Right
$cboboxTags.Top     = $txtboxTitle.Top
$cboboxTags.Width   = $txtboxYear.Width + 10
;$cboboxTags.Height   = 15 ; $txtboxTitle.Height
;$cboboxTags.FontSize   = 8 ; 10
$cboboxTags.MultiColumn   = 0

;--- DIRECTED BY
$lblDirectors = $formMain.Label()
$lblDirectors.Caption   = "Director(s):"
$lblDirectors.Left   = $lblTitle.Left
$lblDirectors.Top   = $lblTitle.Bottom + 5
$lblDirectors.Width   = $lblIMDbID.Width
$lblDirectors.Height   = 18

$cboboxDirectors = $formMain.ComboBox()
$cboboxDirectors.Left       = $txtboxTitle.Left
$cboboxDirectors.Top        = $lblDirectors.Top - 2
$cboboxDirectors.Width      = 105
$cboboxDirectors.Height      = $lblDirectors.Height
$cboboxDirectors.MultiColumn   = 0
$cboboxDirectors.Style      = 0

;--- WRITTEN BY
$lblWriters = $formMain.Label()
$lblWriters.Caption   = " Writer(s):" ;|There's a bug with "W" & non-arial fonts...
$lblWriters.Left   = $cboboxDirectors.Right + 15
$lblWriters.Top      = $lblDirectors.Top
$lblWriters.Width   = 50
$lblWriters.Height   = $lblDirectors.Height

$cboboxWriters = $formMain.ComboBox()
$cboboxWriters.Left       = $lblWriters.Right
$cboboxWriters.Top        = $lblDirectors.Top - 2
$cboboxWriters.Width      = $cboboxDirectors.Width
$cboboxWriters.Height      = $cboboxDirectors.Height
$cboboxWriters.MultiColumn   = 0

;---------------------------------------
; Just a textbox for output verification
$txtboxTest = $formMain.TextBox()
$txtboxTest.Left      = $lblTitle.Left
$txtboxTest.Top         = $lblDirectors.Bottom + 5
$txtboxTest.Width      = $formMain.Width - 15
$txtboxTest.Height      = 100
$txtboxTest.MultiLine      = 1
$txtboxTest.FontSize      = 7




;---------------------------------------
;
$formMain.Show

WHILE $formMain.Visible
   $=EXECUTE($formMain.DoEvents())
LOOP

EXIT 1

;_______________________________________________________________________________
;===============================================================================
; Parses the data from the IMDb webpage, fetched by GetPage()
Function GetIMDb()
   $btnGetIMDb.Enabled   = 0
   $page = ""
   IF $txtboxIMDbID.Value = ""
      $null = MESSAGEBOX("You must enter a valid IMDb number", "Error", 16)
      $btnGetIMDb.Enabled = 1   ; re-enable the button before bailing out
      EXIT 1
   ENDIF
   $URL    = $IMDbSite + $txtboxIMDbID.Value
   $page   = GetPage($URL)
   IF $page = ""
      $null = MESSAGEBOX("Failed to retrieve data from the URL: $URL" + $cr + $cr +
                "The site or your Internet connection may be down.", "Error fetching IMDb data", 16)
      $btnGetIMDb.Enabled = 1   ; re-enable the button before bailing out
      EXIT 1
   ENDIF
   ;--- Title & Year --------------
   ; TODO:         * Handle years like '(2001/I)' (handled now, except for the "/I" part...)
   ; CONSIDERATION:    * If the year comes BEFORE any other parts within parentheses,
   ;           it belongs to the title; if NOT, it is technical movie info
   $TitleContents = SUBSTR($page, INSTR($page, "<title>") + 7, INSTR($page, "</title>") - (INSTR($page, "<title>") + 7))
   $TitleParts = SPLIT($TitleContents, "(")
   $TitlePartCount = 0
   FOR EACH $TitlePart IN $TitleParts
      $TitlePartCount = $TitlePartCount + 1
      $TitlePart = TRIM($TitlePart)
      ;? "TitlePart #" + $TitlePartCount + " " + CHR(34) + "$TitlePart" + CHR(34)
      IF $TitlePartCount = 1
         ;|First part is always the title
         $txtboxTitle.Value = $TitlePart
      ENDIF
      IF INSTR($TitlePart, ")")
         ;|We have a year or a tag
         $YearEval = VAL(SUBSTR($TitlePart, 1, 4))
         IF $YearEval <> 0
            ;|It's the year, or at least a 4-digit value
            $txtboxYear.Value = $YearEval
         ELSE
            ; we have a tag, not a year; restore the left parenthesis removed by SPLIT
            $TitlePart = "(" + $TitlePart
            $cboboxTags.AddItem("$TitlePart")
         ENDIF
      ENDIF
   NEXT
   
   ;--- Directed By ---------------
   ; TODO: Rev0: Nothing/done
   ;   Next: Some directors don't have URLs
   ; Movie with more than 1 director:
   ; http://akas.imdb.com/Details?0235326
   $Directors = SUBSTR($page, INSTR($page, "Directed by</b>") + 15, INSTR($page, "Writing credits") - (INSTR($page, "Directed by</b>") + 15))
   $Directors = SPLIT($Directors, "<br>")
   FOR EACH $Director IN $Directors
      IF INSTR("$Director", "/Name?") <> 0
         ; the string contains a director name and link
         $Director = SUBSTR("$Director", INSTR("$Director", ">") + 1, INSTRREV("$Director", "<") - (INSTR("$Director", ">") + 1))
         ;? "Director: " + CHR(34) + $Director + CHR(34)
         $cboboxDirectors.AddItem("$Director")
      ENDIF
   NEXT
   
   ;--- Writing credits -----------
   ; TODO: Rev0: Nothing/done
   ;   Next: Some writers don't have URLs
   ; Movie with more than 1 writer:
   ; http://akas.imdb.com/Details?0235326
   $Writers = SUBSTR($page, INSTR($page, "Writing credits</b>") + 19, INSTR($page, "Genre:</b>") - (INSTR($page, "Writing credits</b>") + 19))
   $Writers = SPLIT($Writers, "<br>")
   FOR EACH $Writer IN $Writers
      IF INSTR("$Writer", "/Name?") <> 0
         ; the string contains a writer name and link
         $Writer = SUBSTR("$Writer", INSTR("$Writer", ">") + 1, INSTRREV("$Writer", "<") - (INSTR("$Writer", ">") + 1))
         ;? "Writer: " + CHR(34) + $Writer + CHR(34)
         $cboboxWriters.AddItem("$Writer")
      ENDIF
   NEXT


   ;Data retrieval on Details? pages:
   ;Data:                Starting point:                         Ending point:                           Comments:
   ;-----                ---------------                         -------------                           ---------
   ;Title, Year & Tags   1st occurrence of "<title>"             1st occurrence of "</title>"            Parse parts with SPLIT()
   ;Directed By          1st occurrence of "Directed By"         1st occurrence of "Writing credits"     Some titles miss "Directed By"
   ;Writing credits      1st occurrence of "Writing credits"     1st occurrence of "Genre:</b>"          Some titles miss "Writing credits"
   $btnGetIMDb.Enabled   = 1
EndFunction

;---------------------------------------
; Loads a webpage into a variable
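Function GetPage($URL)
   DIM $htmldata
   $htmldata = CreateObject("Microsoft.XMLHTTP")
   ; Third argument 0 = synchronous: send returns once the response has arrived
   $htmldata.open("GET", $URL, 0)
   $htmldata.send
   ; Return the response as text (responsebody would give the raw bytes instead)
   $GetPage = $htmldata.responsetext
EndFunction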
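
The three SUBSTR/INSTR extractions in GetIMDb() all follow the pattern from the table in the comments: take the text between a starting marker and an ending marker. Below is a minimal sketch of how that could be factored into one helper. The function name Between() and its parameter names are hypothetical and not part of the script above. Unlike the original code, it returns an empty string when a marker is missing (the table notes that some titles lack "Directed By" or "Writing credits"), and it only looks for the ending marker after the starting marker rather than anywhere in the page.

```kixtart
; Hypothetical helper (an assumption, not part of the original script):
; returns the text between the first occurrence of $StartTag and the next
; occurrence of $EndTag after it. Returns "" and sets @ERROR to 1 if either
; marker is missing.
Function Between($Text, $StartTag, $EndTag)
   Dim $Start, $Rest, $Stop
   $Between = ""
   $Start = InStr($Text, $StartTag)
   If $Start = 0
      Exit 1                     ; start marker not found
   EndIf
   ; Keep only what follows the start marker
   $Rest = Right($Text, Len($Text) - ($Start + Len($StartTag) - 1))
   $Stop = InStr($Rest, $EndTag)
   If $Stop = 0
      Exit 1                     ; end marker not found
   EndIf
   If $Stop > 1
      ; Everything before the end marker is the wanted text
      $Between = SubStr($Rest, 1, $Stop - 1)
   EndIf
   Exit 0
EndFunction
```

With something like this in place, the director block could be fetched with `$Directors = Between($page, "Directed by</b>", "Writing credits")`, and a title without a director list would simply give an empty string to SPLIT().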
All times are GMT