Sep 5 23:39 2011 README Page 1 Sep 5 23:39 2011 table of contents Page 1 xv6 is a re−implementation of Dennis Ritchie’s and Ken Thompson’s Unix Version 6 (v6). xv6 loosely follows the structure and style of v6, but is implemented for a modern x86−based multiprocessor using ANSI C. The numbers to the left of the file names in the table are sheet numbers. The source code has been printed in a double column format with fifty lines per column, giving one hundred lines per sheet (or page). Thus there is a convenient relationship between line numbers and sheet numbers. ACKNOWLEDGMENTS xv6 is inspired by John Lions’s Commentary on UNIX 6th Edition (Peer to Peer Communications; ISBN: 1−57398−013−7; 1st edition (June 14, 2000)). See also http://pdos.csail.mit.edu/6.828/2007/v6.html, which provides pointers to on−line resources for v6. xv6 borrows code from the following sources: JOS (asm.h, elf.h, mmu.h, bootasm.S, ide.c, console.c, and others) Plan 9 (entryother.S, mp.h, mp.c, lapic.c) FreeBSD (ioapic.c) NetBSD (console.c) The following people made contributions: Russ Cox (context switching, locking) Cliff Frey (MP) Xiao Yu (MP) Nickolai Zeldovich Austin Clements In addition, we are grateful for the patches contributed by Greg Price, Yandong Mao, and Hitoshi Mitake. The code in the files that constitute xv6 is Copyright 2006−2011 Frans Kaashoek, Robert Morris, and Russ Cox. ERROR REPORTS # basic headers 01 types.h 01 param.h 02 memlayout.h 02 defs.h 04 x86.h 06 asm.h 07 mmu.h 09 elf.h # entering xv6 10 entry.S 11 entryother.S 12 main.c # locks 14 spinlock.h 14 spinlock.c # processes 16 vm.c 20 proc.h 21 proc.c 26 swtch.S 27 kalloc.c # system calls 28 traps.h 28 vectors.pl 29 trapasm.S 29 trap.c 31 syscall.h 31 syscall.c 33 sysproc.c # file system 34 buf.h 34 fcntl.h 35 stat.h 35 fs.h 36 file.h 36 ide.c 38 bio.c 40 log.c 42 fs.c 50 file.c 52 sysfile.c 57 exec.c # pipes 58 pipe.c # string operations 59 string.c # low−level hardware 61 mp.h 62 mp.c 64 lapic.c 66 ioapic.c 67 picirq.c 68 kbd.h 69 kbd.c 70 console.c 73 timer.c 74 uart.c # user−level 75 initcode.S 75 usys.S 76 init.c 76 sh.c # bootloader 82 bootasm.S 83 bootmain.c If you spot errors or have suggestions for improvement, please send email to Frans Kaashoek and Robert Morris (kaashoek,rtm@csail.mit.edu). BUILDING AND RUNNING XV6 To build xv6 on an x86 ELF machine (like Linux or FreeBSD), run "make". On non−x86 or non−ELF machines (like OS X, even on x86), you will need to install a cross−compiler gcc suite capable of producing x86 ELF binaries. See http://pdos.csail.mit.edu/6.828/2011/tools.html. Then run "make TOOLPREFIX=i386−jos−elf−". To run xv6, you can use Bochs or QEMU, both PC simulators. Bochs makes debugging easier, but QEMU is much faster. To run in Bochs, run "make bochs" and then type "c" at the bochs prompt. To run in QEMU, run "make qemu". To create a typeset version of the code, run "make xv6.pdf". This requires the "mpage" utility. See http://www.mesa.nl/pub/mpage/. The source listing is preceded by a cross−reference that lists every defined constant, struct, global variable, and function in xv6. Each entry gives, on the same line as the name, the line number (or, in a few cases, numbers) where the name is defined. Successive lines in an entry list the line numbers where the name is used. For example, this entry: swtch 2658 0374 2428 2466 2657 2658 indicates that swtch is defined on line 2658 and is mentioned on five lines on sheets 03, 24, and 26. Sep 5 23:39 2011 cross−references Page 1 Sep 5 23:39 2011 cross−references Page 2 acquire 1474 0377 1474 1478 2358 2417 2474 2566 2579 2766 3372 3392 3757 3979 4127 4145 4510 4539 4554 5041 5056 5863 7060 7216 7258 allocproc 2155 2155 2207 2260 allocuvm 1827 0422 1827 1841 5753 alltraps 2904 2859 2867 2880 2904 ALT 6810 6810 6838 6840 argfd 5213 5213 5256 5271 5306 argint 3195 0395 3195 3208 3356 3370 5218 5508 5576 5577 argptr 3204 0396 3204 5271 5657 argstr 3221 0397 3221 5318 5557 5575 5606 __attribute__ 1316 0270 0365 1209 BACK 7661 7661 7774 7920 backcmd 7696 7914 7696 7709 7775 8042 8155 8190 BACKSPACE 7150 7150 7167 7194 balloc 4304 4304 4326 4617 BBLOCK 3591 3591 4313 4338 B_BUSY 3409 3409 3808 3926 3941 3966 3976 B_DIRTY 3411 3411 3737 3766 bwrite 3964 0264 3964 3967 4175 bzero 4289 4289 4320 C 6831 7209 6831 6879 6904 6907 6908 6910 7222 7229 7240 CAPSLOCK 6812 6812 6845 6986 cgaputc 7155 7155 7198 clearpteu 1903 0431 1903 1909 cli 0557 0557 0559 1126 7189 8212 cmd 7665 7665 7677 7686 7693 7698 7702 7718 7723 7731 7751 7775 7777 7857 7858 7859 7864 7866 7868 7871 7872 7873 7876 7879 7880 7885 7886 7887 7900 7901 7903 7907 7908 7909 7914 7916 7918 7921 7922 8012 8015 8017 8021 8031 8034 8037 8046 8048 8050 8058 8060 8063 8078 8081 8085 8108 8112 8113 8122 8128 8137 8145 8151 8152 8166 8172 8173 8190 8191 8194 COM1 7413 7413 7423 7426 7429 7430 7431 7441 7457 7459 commit_trans 4136 0335 4136 5073 5346 5355 5445 5558 5562 5579 2160 2518 2781 3815 4457 4564 5884 7306 2323 2533 3016 3920 4490 5025 5905 2237 5743 2885 2903 5283 5294 3224 3332 5271 5283 5626 5283 5306 5408 5508 5626 1316 8189 7914 7916 7226 7232 4625 4629 3927 3938 3988 3771 3810 3829 3968 begin_trans 4125 0334 4125 5071 5413 5511 5556 bfree 4331 4331 4662 4672 bget 3916 3916 3946 3956 binit 3889 0261 1229 3889 bmap 4610 4610 4636 4719 bootmain 8317 8268 8317 BPB 3588 3588 3591 4312 bread 3952 0262 3952 4076 4104 4173 4282 4338 4411 4432 4668 4719 4769 brelse 3974 0263 3974 3977 4096 4112 4176 4319 4324 4345 4441 4525 4632 4773 BSIZE 3557 3557 3568 3582 4078 4174 4294 4721 4765 4769 buf 3400 0250 0262 0263 0333 1970 1973 3400 3404 3405 3676 3679 3725 3806 3809 3877 3891 3903 3915 3954 3964 3974 4077 4089 4090 4105 4111 4112 4269 4280 4291 4405 4429 4504 4705 4755 7029 7047 7203 7224 7301 7308 7784 7789 7803 7815 7820 7821 7825 B_VALID 3410 3410 3770 3810 5174 5323 5574 4675 4769 4314 4339 4077 4089 4293 4313 4517 4626 4080 4284 4417 4674 4081 4296 4420 4722 3588 4057 4719 4720 4770 4771 0264 1982 3406 3754 3881 3918 4005 4096 4159 4307 4613 7040 7238 7787 7816 0306 1984 3661 3804 3885 3951 4076 4104 4173 4333 4657 7044 7268 7788 7819 3829 3957 4079 4111 6905 6906 7209 7219 7269 5755 1560 7110 7687 7706 7737 7852 7860 7869 7874 7882 7888 7905 7910 7919 8013 8024 8039 8053 8064 8100 8116 8138 8161 8178 7692 7715 7741 7855 7863 7870 7875 7884 7889 7906 7913 7920 8014 8030 8042 8055 8075 8103 8121 8144 8164 8184 7427 7428 7434 7440 7467 7469 5179 5328 5452 5513 5583 CONSOLE 3639 3639 7321 7322 consoleinit 7316 0267 1225 7316 consoleintr 7212 0269 6998 7212 consoleread 7251 7251 7322 consolewrite 7301 7301 7321 consputc 7186 7016 7047 7069 7094 7095 7186 7239 7308 context 2043 0251 0374 2006 2188 2189 2190 2466 2628 copyout 1968 0430 1968 5763 copyuvm 1916 0427 1916 1927 cprintf 7052 0268 1222 1264 2630 2632 3040 3285 6319 6339 7052 7112 7113 cpu 2004 0309 1222 1264 1406 1466 1487 1561 1562 1570 1631 1637 1772 1775 2004 2014 2428 2459 2465 3015 3040 3041 3058 3060 6213 7112 cpunum 6501 0324 1256 1288 6673 6682 CR0_PE 0727 0727 1135 1171 CR0_PG 0737 0737 1050 1171 CR0_WP 0733 0733 1050 1171 CR4_PSE 0739 0739 1043 1164 create 5457 5457 5477 5490 7475 7087 7090 7226 7232 2043 2061 2191 2428 5774 1929 2264 1841 3053 6511 7114 2626 3058 6662 7117 1266 1508 1572 1773 2018 2466 3053 6214 1278 1546 1618 1774 2029 2467 3054 6511 1624 6501 8243 5494 5512 Sep 5 23:39 2011 cross−references Page 3 Sep 5 23:39 2011 cross−references Page 4 5557 5578 CRTPORT 7151 7151 7160 7161 7178 7179 7180 CTL 6809 6809 6835 6839 deallocuvm 1855 0423 1842 1855 DEVSPACE 0204 0204 1729 1742 devsw 3632 3632 3637 4708 4760 5007 7321 dinode 3572 3572 3582 4406 4433 4505 4518 dirent 3596 3596 4815 4855 dirlink 4852 0286 4822 4852 5339 5489 5493 dirlookup 4812 0287 4812 4818 5421 5467 DIRSIZ 3594 3594 3598 4805 4929 4991 5315 DPL_USER 0779 0779 1627 1628 2973 3068 3077 E0ESC 6816 6816 6970 6974 6980 elfhdr 0955 0955 5715 8319 ELF_MAGIC 0952 0952 5728 8330 ELF_PROG_LOAD 0986 0986 5739 enter_alloc 2725 0314 1299 1755 entry 1040 0961 1036 1039 2852 2853 5787 8345 8346 EOI 6414 6414 6484 6525 ERROR 6435 6435 6477 ESR 6417 fork 2254 0360 2254 3311 7625 7843 7845 fork1 7839 7700 7742 7754 7824 7839 forkret 2483 2117 2191 2483 freevm 1883 0424 1883 1888 5790 5795 gatedesc 0901 0523 0526 0901 getcallerpcs 1526 0378 1488 1526 getcmd 7784 7784 7815 gettoken 7956 7956 8041 8045 8071 8107 8111 growproc 2231 0361 2231 3359 havedisk1 3678 3678 3714 3812 holding 1544 0379 1477 1504 ialloc 4402 0288 4402 4422 IBLOCK 3585 3585 4411 4432 I_BUSY 3627 3627 4511 4513 4557 4559 ICRHI 6428 6428 6487 6556 ICRLO 6418 6418 6488 6489 6569 ID 6411 6411 6447 6516 IDE_BSY 3663 3663 3687 IDE_CMD_READ 3668 3668 3741 IDE_CMD_WRITE 3669 3669 3738 IDE_DF 3665 3665 3689 IDE_DRDY 3664 3664 3687 7162 7163 7181 6985 1889 2240 4710 4758 7322 4412 4430 5364 5404 4867 4875 5494 4859 4974 4872 4928 5405 5461 2214 2215 6975 6977 8324 2725 1040 2731 6121 8321 6417 6480 6481 exec 5710 0273 5642 5710 7630 7726 7727 EXEC 7657 7657 7722 7859 execcmd 7669 7853 7669 7710 7723 8121 8127 8128 exit 2304 0359 2304 2340 3069 3078 3317 7561 7626 7631 7735 7780 7828 EXTMEM 0202 0202 0208 fdalloc 5232 5232 5258 5526 fetchint 3167 0398 3167 3197 fetchstr 3179 0399 3179 3226 file 3600 0252 0276 0277 0281 0282 0351 4271 5004 5010 5026 5038 5039 5079 5102 5152 5216 5232 5253 5292 5303 5505 5821 7010 7408 7734 7864 7872 filealloc 5021 0276 5021 5526 fileclose 5052 0277 2315 5052 5528 5665 5666 filedup 5039 0278 2279 5039 fileinit 5014 0279 1230 5014 fileread 5102 0280 5102 5117 filestat 5079 0281 5079 5308 filewrite 5152 0282 5152 5184 FL_IF 0710 0710 1562 1568 6508 7568 7629 8165 7853 7855 8156 8166 3005 3009 7516 7519 7716 7725 7835 5662 5633 5639 0278 2064 5020 5052 5207 5267 5654 7678 8072 0280 3600 5023 5054 5213 5279 5806 7733 5827 5058 5297 5854 5856 5043 5260 5273 5189 5285 2218 2463 7560 7623 7761 7776 1940 2371 2961 2628 7115 8057 8070 8133 1544 2457 5476 5477 4517 4536 4540 6568 6557 6559 IDE_ERR 3666 3666 3689 ideinit 3701 0304 1232 ideintr 3752 0305 3024 idelock 3675 3675 3705 3815 3830 iderw 3804 0306 3804 3958 3969 idestart 3725 3679 3725 idewait 3683 3683 3708 idtinit 2979 0406 1265 idup 4488 0289 2280 iget 4453 4394 4418 4959 iinit 4389 0290 1231 ilock 4502 0291 4502 5082 5111 5351 5415 5479 5519 7283 7310 inb 0453 0453 3687 6967 7161 7441 7457 8231 8354 initlock 1462 0380 1462 3705 3893 5835 7318 initlog 4055 0332 2494 inituvm 1786 0425 1786 inode 3613 0253 0286 0291 0292 0297 0298 0426 1803 3633 3634 3701 3752 3757 3759 3778 3833 3809 3811 3813 3728 3776 3825 3730 3766 2979 4488 4961 4453 4473 4830 4389 4508 5175 5423 5608 4528 5325 5465 5722 4964 5338 5469 7263 3713 6354 6964 7163 7434 7440 7467 7469 8223 2125 2744 2975 4061 4391 5016 7319 4055 4058 1791 2211 0287 0293 0299 2065 4274 0288 0294 0300 3606 4385 0289 0295 0301 3613 4394 Sep 5 23:39 2011 cross−references Page 5 4401 4427 4452 4487 4488 4502 4574 4610 4654 4752 4811 4812 4953 4956 4988 5361 5403 5456 5554 5569 5604 7301 INPUT_BUF 7200 7200 7203 7224 7240 7268 insl 0462 0462 0464 3767 install_trans 4071 4071 4119 4140 INT_DISABLED 6619 6619 6667 ioapic 6627 6307 6329 6330 6636 6637 6643 IOAPIC 6608 6608 6658 ioapicenable 6673 0309 3707 6673 ioapicid 6217 0310 6217 6330 6662 ioapicinit 6651 0311 1224 6651 ioapicread 6634 6634 6659 6660 ioapicwrite 6641 6641 6667 6668 IO_PIC1 6707 6707 6720 6735 6752 6762 6776 IO_PIC2 6708 6708 6721 6736 6767 6770 6779 IO_RTC 6535 6535 6548 6549 IO_TIMER1 7359 7359 7368 7378 IPB 3582 3582 3585 3591 4518 iput 4552 0292 2320 4552 4860 4982 5072 IRQ_COM1 2833 4455 4534 4685 4852 4995 5460 5716 4461 4552 4702 4856 5316 5506 7251 7236 7238 8373 6624 6627 6644 6658 7326 7443 6347 6661 6662 6681 6682 6744 6747 6777 6765 6766 6780 7379 4412 4433 4558 4577 5344 5614 2833 3034 7442 IRQ_ERROR 2835 2835 6477 IRQ_IDE 2834 2834 3023 3027 IRQ_KBD 2832 2832 3030 7325 IRQ_SLAVE 6710 6710 6714 6752 IRQ_SPURIOUS 2836 2836 3039 6457 IRQ_TIMER 2831 2831 3014 3073 isdirempty 5361 5361 5368 5427 ismp 6215 0338 1233 6215 6340 6343 6655 itrunc 4654 4274 4561 4654 iunlock 4534 0293 4534 4537 5084 5114 5178 5613 7256 7305 iunlockput 4574 0294 4574 4966 5327 5340 5343 5439 5443 5451 5496 5521 5529 5610 5748 5797 iupdate 4427 0295 4427 4563 5333 5353 5437 5487 I_VALID 3628 3628 4516 4526 kalloc 2777 0315 1792 1794 1923 1931 1934 2777 5731 5829 KBDATAP 6804 6804 6967 kbdgetc 6956 6956 6998 kbdintr 6996 0321 3031 6996 KBS_DIB 6803 6803 6965 KBSTATP 6802 6802 6964 Sep 5 23:39 2011 cross−references Page 6 7443 3706 3707 7326 6767 6464 7380 6312 6320 6675 4576 4971 5334 5532 4975 5354 5468 5561 4978 5428 5472 5582 4680 4778 5442 5483 4555 1839 1846 2173 2209 KERNBASE 0207 0207 0208 0212 0218 0220 0221 1832 1889 2730 KERNLINK 0208 0208 1727 KEY_DEL 6828 6828 6869 6891 KEY_DN 6822 6822 6865 6887 KEY_END 6820 6820 6868 6890 KEY_HOME 6819 6819 6868 6890 KEY_INS 6827 6827 6869 6891 KEY_LF 6823 6823 6867 6889 KEY_PGDN 6826 6826 6866 6888 KEY_PGUP 6825 6825 6866 6888 KEY_RT 6824 6824 6867 6889 KEY_UP 6821 6821 6865 6887 kfree 2756 0316 1871 1873 2265 2369 2747 5852 5873 kill 2575 0362 2575 3059 kinit 2740 0317 1236 2740 KSTACKSIZE 0151 0151 1054 1063 2177 kvmalloc 1753 0418 1218 1753 lapiceoi 6522 0326 3021 3025 3042 6522 lapicinit 6451 0327 1220 1256 lapicstartap 6540 0328 1304 6540 lapicw 6444 6444 6457 6463 6468 6469 6474 6481 6484 6487 0213 0217 1321 1533 6915 6911 6914 6914 6915 6913 6912 6912 6913 6911 1893 1896 2756 2761 3334 7567 1300 1775 3032 3036 6451 6464 6465 6477 6480 6488 6493 6525 6556 6557 6569 lcr3 0590 0590 1764 1779 lgdt 0512 0512 0520 1133 lidt 0526 0526 0534 2981 LINT0 6433 6433 6468 LINT1 6434 6434 6469 LIST 7660 7660 7740 7907 listcmd 7690 7901 7690 7711 7741 8046 8157 8184 loadgs 0551 0551 1634 loaduvm 1803 0426 1803 1809 log 4040 4050 4040 4050 4061 4065 4075 4076 4092 4093 4094 4108 4109 4120 4129 4131 4132 4145 4146 4147 4165 4168 4169 4177 4178 logheader 4035 4035 4046 4057 4105 LOGSIZE 0160 0160 4037 4163 log_write 4159 0333 4159 4295 4416 4440 4630 ltr 0538 0538 0540 1776 mappages 1679 1679 1745 1794 MAXARG 0159 0159 5622 5714 MAXARGS 7663 7663 7671 7672 MAXFILE 3569 3569 4765 memcmp 5965 0386 5965 6245 6559 6568 1633 8241 8183 7901 7903 1812 5745 4063 4077 4104 4127 4138 4148 4172 4064 4089 4107 4128 4141 4163 4173 4058 4090 5167 4318 4344 4772 1846 1934 5760 8140 6288 Sep 5 23:39 2011 cross−references Page 7 Sep 5 23:39 2011 cross−references Page 8 memmove 5981 0387 1285 1795 4078 4174 4283 4721 4771 4929 6004 7173 memset 5954 0388 1666 1741 2190 2213 2733 4414 5432 5629 7787 7858 7869 7919 microdelay 6531 0329 6531 6558 7458 min 4273 4273 4720 4770 mp 6102 6102 6208 6237 6246 6255 6260 6268 6269 6280 6287 6294 6304 mpbcpu 6220 0339 1220 6220 MPBUS 6152 6152 6333 mpconf 6113 6113 6279 6282 mpconfig 6280 6280 6310 mpenter 1252 1252 1301 mpinit 6301 0340 1219 6301 mpioapic 6139 6139 6307 6329 MPIOAPIC 6153 6153 6328 MPIOINTR 6154 6154 6334 MPLINTR 6155 6155 6335 mpmain 1262 1209 1239 1257 mpproc 6128 6128 6306 6317 MPPROC 6151 6151 6316 mpsearch 6256 6256 6285 mpsearch1 6238 2418 2557 2580 NPTENTRIES 0822 0822 1867 NSEGS 2001 1611 2001 2008 nulterminate 8152 8015 8030 8152 8180 8185 8186 NUMLOCK 6813 6813 6846 O_CREATE 3453 3453 5510 8078 O_RDONLY 3450 3450 5520 8075 O_RDWR 3452 3452 5538 7614 outb 0471 0471 3711 3720 3733 3734 3735 3741 6353 6354 6720 6721 6735 6747 6752 6762 6767 6770 6776 6780 7160 7162 7180 7181 7377 7423 7426 7427 7430 7431 7459 8364 8365 8366 8369 outsl 0483 0483 0485 3739 outw 0477 0477 1181 1183 O_WRONLY 3451 3451 5537 5538 P2V 0218 0218 1726 6262 panic 7105 7832 0270 1478 1505 1691 1743 1778 1812 1871 1888 1929 2210 2310 2460 2462 2464 2731 2761 3055 3811 3813 3946 4058 4164 4166 4422 4473 4508 4558 4636 4818 4875 5043 5058 5189 5368 5426 1933 1982 4439 4524 4931 5981 1793 2764 5954 7885 1845 4294 7175 7906 6560 6570 6244 6264 6283 6310 6245 6265 6285 6350 6287 6305 6319 6339 6331 1262 6326 6238 6264 6268 6271 multiboot_header 1025 1024 1025 namecmp 4803 0296 4803 4825 5418 namei 4989 0297 2223 4989 5320 5606 5720 nameiparent 4996 0298 4954 4969 4981 5336 5410 5463 namex 4954 4954 4992 4998 NBUF 0155 0155 3881 3903 ncpu 6216 1222 1287 2019 3707 6318 6319 6323 6324 6345 NCPU 0152 0152 2018 6213 NDEV 0157 0157 4708 4758 5007 NDIRECT 3567 3567 3569 3578 3624 4620 4624 4625 4660 4668 4675 4676 NELEM 0434 0434 1744 2622 3282 nextpid 2116 2116 2169 NFILE 0154 0154 5010 5026 NINDIRECT 3568 3568 3569 4622 4670 NINODE 0156 0156 4385 4461 NO 6806 6806 6852 6855 6857 6859 6860 6862 6874 6879 6880 6881 6882 6902 6903 6905 6906 6908 NOFILE 0153 0153 2064 2277 2313 5236 NPDENTRIES 0821 0821 1317 1890 NPROC 0150 0150 2111 2161 2329 5517 4996 6216 6325 4615 4667 5631 6858 6877 6884 6907 5220 2362 2619 8173 8179 8191 8081 7616 7807 3731 3736 6548 6736 6765 6777 7178 7378 7428 8228 8367 3732 3738 6549 6744 6766 6779 7179 7379 7429 8236 8368 8274 8276 8078 8081 6550 7152 1569 1791 1909 2340 2506 3728 3967 4326 4528 4822 5117 5434 1571 1809 1927 2458 2509 3809 3977 4342 4537 4867 5184 5477 5490 5494 7063 7701 7720 7753 8028 8072 8106 8141 panicked 7018 7018 7118 7188 parseblock 8101 8101 8106 8125 parsecmd 8018 7702 7825 8018 parseexec 8117 8014 8055 8117 parseline 8035 8012 8024 8035 parsepipe 8051 8013 8039 8051 parseredirs 8064 8064 8112 8131 PCINT 6432 6432 6474 pde_t 0103 0103 0420 0421 0424 0425 0426 0431 1210 1270 1654 1656 1679 1739 1786 1803 1883 1903 1915 1952 1968 2055 PDX 0812 0812 1659 PDXSHIFT 0827 0812 0818 0827 peek 8001 8001 8025 8040 8069 8105 8109 PGROUNDDOWN 0830 0830 1685 1686 PGROUNDUP 0829 0829 1837 1863 5752 PGSIZE 0823 0823 0829 0830 1695 1696 1741 1794 1808 1810 1838 1845 1846 1925 1933 1934 2212 2219 2733 2760 2764 5753 PHYSTOP 0203 0203 1728 1742 7105 7112 7832 7845 8110 8136 8046 8108 8058 8142 0422 0427 1317 1733 1827 1916 5718 0423 0430 1610 1736 1855 1918 1321 8044 8056 8124 8132 1975 2732 2745 1316 1790 1814 1864 1979 2734 5755 1666 1793 1817 1867 1985 2746 1743 2746 Sep 5 23:39 2011 cross−references Page 9 2760 picenable 6725 0344 3706 6725 7442 picinit 6732 0345 1223 6732 picsetmask 6717 6717 6727 6783 pinit 2123 0363 1227 2123 pipe 5811 0254 0352 0353 5069 5109 5159 5829 5835 5839 5880 5901 7563 PIPE 7659 7659 7750 7886 pipealloc 5821 0351 5659 5821 pipeclose 5861 0352 5069 5861 pipecmd 7684 7880 7684 7712 7751 8058 8158 8178 piperead 5901 0353 5109 5901 PIPESIZE 5809 5809 5813 5886 pipewrite 5880 0354 5159 5880 popcli 1566 0383 1521 1566 1780 printint 7026 7026 7077 7081 proc 2053 0255 0358 0398 1205 1458 1606 1775 2015 2030 2106 2111 2114 2161 2204 2235 2243 2244 2257 2271 2272 2278 2284 2306 2309 2316 2320 2321 2330 2338 2355 2383 2389 2410 2428 2433 2461 2505 2523 2524 2557 2577 2580 7325 7380 0354 5811 5843 7752 3605 5823 5861 7753 8177 7880 7882 5894 5916 1569 1571 0399 1638 2053 2154 2237 2264 2279 2314 2326 2362 2418 2466 2528 2615 0428 1769 2059 2157 2240 2270 2280 2315 2329 2363 2425 2475 2555 2619 2955 3004 3006 3059 3060 3062 3077 3155 3167 3210 3226 3279 3286 3287 3306 3375 3657 4267 5220 5237 5238 5615 5633 5639 5781 5784 5785 5788 5789 5804 6211 6306 6317 6322 7013 7261 procdump 2604 0364 2604 7220 proghdr 0974 0974 5717 8320 PTE_ADDR 0844 0844 1661 1813 1930 1961 PTE_P 0833 0833 1319 1321 1690 1692 1868 1957 PTE_PS 0840 0840 1319 1321 pte_t 0847 0847 1653 1657 1683 1806 1857 1954 PTE_U 0835 0835 1670 1794 1934 1959 PTE_W 0834 0834 1319 1321 1728 1729 1794 PTX 0815 0815 1672 PTXSHIFT 0826 0815 0818 0826 pushcli 1555 0382 1476 1555 rcr2 0582 0582 3054 3061 readeflags 0544 0544 1559 1568 read_head 4087 4087 4118 readi 4702 0299 1818 4702 5112 5367 5368 Sep 5 23:39 2011 cross−references Page 10 3008 3068 3179 3281 3340 4961 5296 5664 5786 5887 6318 7410 3051 3073 3197 3283 3358 5205 5614 5704 5787 5907 6319 8334 1869 1892 1660 1670 1891 1928 1661 1663 1905 1919 1846 1910 1670 1726 1846 1934 1771 2463 6508 4821 4866 5726 5737 readsb 4278 0285 4062 4278 4311 4409 readsect 8360 8360 8395 readseg 8379 8314 8327 8338 8379 recover_from_log 4116 4052 4066 4116 REDIR 7658 7658 7730 7870 8171 redircmd 7675 7864 7675 7713 7731 7864 8075 8078 8081 8159 REG_ID 6610 6610 6660 REG_TABLE 6612 6612 6667 6668 6681 REG_VER 6611 6611 6659 release 1502 0381 1502 1505 2164 2377 2384 2435 2477 2519 2532 2568 2586 2770 2785 3019 3376 3394 3759 3778 3833 3942 3991 4132 4148 4480 4492 4514 4542 4569 5029 5033 5045 5066 5872 5875 5888 5908 5919 7101 7248 7282 7309 ROOTDEV 0158 0158 4062 4065 4959 ROOTINO 3556 3556 4959 run 2711 2611 2711 2712 2717 2767 2779 runcmd 7706 7706 7720 7737 7743 7759 7766 7777 7825 RUNNING 2050 2050 2427 2461 2611 safestrcpy 6032 0389 2222 2284 5781 sched 2453 0366 2339 2453 2458 2462 2464 2476 2525 scheduler 2408 4337 7866 8172 6682 2170 2487 2590 3381 3928 4464 4560 5060 5897 7262 2758 7745 3073 6032 2460 0365 1267 2006 2466 SCROLLLOCK 6814 6814 6847 SECTSIZE 8312 8312 8373 8386 SEG 0769 0769 1625 1626 1631 SEG16 0773 0773 1772 SEG_ASM 0660 0660 1190 1191 segdesc 0752 0509 0512 0752 1611 2008 seginit 1616 0417 1221 1255 SEG_KCODE 0741 0741 1150 1625 8253 SEG_KCPU 0743 0743 1631 1634 SEG_KDATA 0742 0742 1154 1626 8258 SEG_NULLASM 0654 0654 1189 8283 SEG_TSS 0746 0746 1772 1773 SEG_UCODE 0744 0744 1627 2214 SEG_UDATA 0745 0745 1628 2215 SETGATE 0921 0921 2972 2973 setupkvm 1734 0420 1734 1755 5731 SHIFT 6808 6808 6836 6837 skipelem 4915 4915 4963 sleep 2503 0367 2389 2503 2609 3379 3830 4512 5892 5911 spinlock 1401 0256 0367 0377 0381 0409 1401 2408 2428 8389 8394 1627 1628 8284 8285 0769 0773 1616 2972 2973 2916 1774 2913 1776 1923 2209 6985 2506 2509 3931 4129 7266 7579 0379 0380 1459 1462 Sep 5 23:39 2011 cross−references Page 11 Sep 5 23:39 2011 cross−references Page 12 1474 1502 1544 2107 2503 2709 2716 2958 3660 3675 3876 3880 4041 4268 4384 5005 5807 5812 7008 7021 7406 STA_R 0669 0786 0669 0786 1190 1625 8284 start 1125 7508 8211 1124 1125 1167 1175 4042 4063 4076 4089 4173 7507 7508 8210 8267 startothers 1274 1208 1235 1274 stat 3504 0257 0281 0300 3504 4685 5079 5203 5304 stati 4685 0300 4685 5083 STA_W 0668 0785 0668 0785 1191 1626 1631 8285 STA_X 0665 0782 0665 0782 1190 1625 8284 sti 0563 0563 0565 1573 2414 stosb 0492 0492 0494 5960 8340 stosl 0501 0501 0503 5958 strlen 6051 0390 5762 5763 6051 8023 strncmp 6008 0391 4805 6008 strncpy 6018 0392 4872 6018 STS_IG32 0800 0800 0927 STS_T32A 0797 0797 1772 STS_TG32 0801 0801 0927 sum 6226 6226 6228 6230 6232 6245 6292 superblock 3560 3111 3261 sys_kill 3328 3237 3256 3328 SYS_kill 3106 3106 3256 sys_link 5313 3238 3269 5313 SYS_link 3120 3120 3269 sys_mkdir 5551 3239 3270 5551 SYS_mkdir 3121 3121 3270 sys_mknod 5567 3240 3267 5567 SYS_mknod 3118 3118 3267 sys_open 5501 3241 3265 5501 SYS_open 3116 3116 3265 3280 3282 sys_pipe 5651 3242 3254 5651 SYS_pipe 3104 3104 3254 sys_read 5265 3243 3255 5265 SYS_read 3105 3105 3255 sys_sbrk 3351 3244 3262 3351 SYS_sbrk 3112 3112 3262 sys_sleep 3365 3245 3263 3365 SYS_sleep 3113 3113 3263 sys_unlink 5401 3246 3268 5401 SYS_unlink 3119 3119 3268 sys_uptime 3388 3249 3264 3388 SYS_uptime 3114 3114 3264 sys_wait 3322 3247 3253 3322 SYS_wait 3103 3103 3253 sys_write 5277 2110 2963 4003 5009 7202 1627 1177 4104 8211 4265 7603 1628 1627 7819 6233 0258 0285 3560 4060 4278 4308 4334 4407 SVR 6415 6415 6457 switchkvm 1762 0429 1254 1756 1762 2429 switchuvm 1769 0428 1769 1778 2244 2426 5789 swtch 2658 0374 2428 2466 2657 2658 syscall 3275 0400 3007 3157 3275 SYSCALL 7553 7560 7561 7562 7563 75 7560 7561 7562 7563 7564 7565 7566 7567 7568 7569 7570 7571 7572 7573 7574 7575 7576 7577 7578 7579 7580 sys_chdir 5601 3229 3259 5601 SYS_chdir 3109 3109 3259 sys_close 5289 3230 3271 5289 SYS_close 3122 3122 3271 sys_dup 5251 3231 3260 5251 SYS_dup 3110 3110 3260 sys_exec 5620 3232 3257 5620 SYS_exec 3107 3107 3257 7512 sys_exit 3315 3233 3252 3315 SYS_exit 3102 3102 3252 7517 sys_fork 3309 3234 3251 3309 SYS_fork 3101 3101 3251 sys_fstat 5301 3235 3258 5301 SYS_fstat 3108 3108 3258 sys_getpid 3338 3236 3261 3338 SYS_getpid 3111 3248 3266 5277 SYS_write 3117 3117 3266 taskstate 0851 0851 2007 TDCR 6439 6439 6463 T_DEV 3502 3502 4707 4757 T_DIR 3500 3500 4817 4965 5435 5485 5520 T_FILE 3501 3501 5470 5512 ticks 2964 0407 2964 3017 3374 3379 3393 tickslock 2963 0409 2963 2975 3372 3376 3379 3394 TICR 6437 6437 6465 TIMER 6429 6429 6464 TIMER_16BIT 7371 7371 7377 TIMER_DIV 7366 7366 7378 7379 TIMER_FREQ 7365 7365 7366 timerinit 7374 0403 1234 7374 TIMER_MODE 7368 7368 7377 TIMER_RATEGEN 7370 7370 7377 TIMER_SEL0 7369 7369 7377 T_IRQ0 2829 2829 3014 3023 3034 3038 3039 6464 6477 6667 6766 TPR 6413 6413 6493 trap 3001 2852 2854 2922 3055 3058 trapframe 0602 5578 5326 5427 5557 5609 3018 3373 3016 3019 3381 3392 3027 3030 3073 6457 6681 6747 3001 3053 Sep 5 23:39 2011 cross−references Page 13 0602 2060 2181 trapret 2927 2118 2186 2926 T_SYSCALL 2826 2826 2973 3003 7557 tvinit 2967 0408 1228 2967 uart 7415 7415 7436 7455 uartgetc 7463 7463 7475 uartinit 7418 0412 1226 7418 uartintr 7473 0413 3035 7473 uartputc 7451 0414 7195 7197 userinit 2202 0368 1237 2202 uva2ka 1952 0421 1952 1976 V2P 0217 0217 1727 1728 V2P_WO 0220 0220 1036 1046 3001 2927 7513 7518 7465 7447 7451 2210 VER 6412 6412 6473 wait 2353 0369 2353 3324 7744 7770 7771 waitdisk 8351 8351 8363 8372 wakeup 2564 0370 2564 3018 4147 4541 4566 5891 5896 5918 wakeup1 2553 2120 2326 2333 walkpgdir 1654 1654 1688 1811 1926 1956 write_head 4102 4102 4121 4139 writei 4752 0301 4752 4874 5434 xchg 0569 0569 1266 1483 yield 2472 0371 2472 3074 7562 7633 7826 3772 3989 5866 5869 7242 2553 2567 1865 1907 4142 5176 5433 1519 Sep 5 23:39 2011 xv6/types.h Page 1 Sep 5 23:39 2011 xv6/param.h Page 1 0100 0101 0102 0103 0104 0105 0106 0107 0108 0109 0110 0111 0112 0113 0114 0115 0116 0117 0118 0119 0120 0121 0122 0123 0124 0125 0126 0127 0128 0129 0130 0131 0132 0133 0134 0135 0136 0137 0138 0139 0140 0141 0142 0143 0144 0145 0146 0147 0148 0149 0150 0151 0152 0153 0154 0155 0156 0157 0158 0159 0160 0161 0162 0163 0164 0165 0166 0167 0168 0169 0170 0171 0172 0173 0174 0175 0176 0177 0178 0179 0180 0181 0182 0183 0184 0185 0186 0187 0188 0189 0190 0191 0192 0193 0194 0195 0196 0197 0198 0199 typedef typedef typedef typedef Sheet 01 unsigned int uint; unsigned short ushort; unsigned char uchar; uint pde_t; #define #define #define #define #define #define #define #define #define #define #define Sheet 01 NPROC 64 // maximum number of processes KSTACKSIZE 4096 // size of per−process kernel stack NCPU 8 // maximum number of CPUs NOFILE 16 // open files per process NFILE 100 // open files per system NBUF 10 // size of disk block cache NINODE 50 // maximum number of active i−nodes NDEV 10 // maximum major device number ROOTDEV 1 // device number of file system root disk MAXARG 32 // max exec arguments LOGSIZE 10 // max data sectors in on−disk log Sep 5 23:39 2011 xv6/memlayout.h Page 1 Sep 5 23:39 2011 xv6/defs.h Page 1 0200 0201 0202 0203 0204 0205 0206 0207 0208 0209 0210 0211 0212 0213 0214 0215 0216 0217 0218 0219 0220 0221 0222 0223 0224 0225 0226 0227 0228 0229 0230 0231 0232 0233 0234 0235 0236 0237 0238 0239 0240 0241 0242 0243 0244 0245 0246 0247 0248 0249 0250 0251 0252 0253 0254 0255 0256 0257 0258 0259 0260 0261 0262 0263 0264 0265 0266 0267 0268 0269 0270 0271 0272 0273 0274 0275 0276 0277 0278 0279 0280 0281 0282 0283 0284 0285 0286 0287 0288 0289 0290 0291 0292 0293 0294 0295 0296 0297 0298 0299 // Memory layout #define EXTMEM 0x100000 #define PHYSTOP 0xE000000 #define DEVSPACE 0xFE000000 // Start of extended memory // Top physical memory // Other devices are at high addresses // Key addresses for address space layout (see kmap in vm.c for layout) #define KERNBASE 0x80000000 // First kernel virtual address #define KERNLINK (KERNBASE+EXTMEM) // Address where kernel is linked #ifndef __ASSEMBLER__ static inline uint v2p(void *a) { return (uint) a − KERNBASE; } static inline void *p2v(uint a) { return (void *) a + KERNBASE; } #endif #define V2P(a) ((uint) a − KERNBASE) #define P2V(a) ((void *) a + KERNBASE) #define V2P_WO(x) ((x) − KERNBASE) #define P2V_WO(x) ((x) + KERNBASE) Sheet 02 // same as V2P, but without casts // same as V2P, but without casts struct struct struct struct struct struct struct struct struct buf; context; file; inode; pipe; proc; spinlock; stat; superblock; // bio.c void struct buf* void void binit(void); bread(uint, uint); brelse(struct buf*); bwrite(struct buf*); // console.c void void void void consoleinit(void); cprintf(char*, ...); consoleintr(int(*)(void)); panic(char*) __attribute__((noreturn)); // exec.c int exec(char*, char**); // file.c struct file* void struct file* void int int int filealloc(void); fileclose(struct file*); filedup(struct file*); fileinit(void); fileread(struct file*, char*, int n); filestat(struct file*, struct stat*); filewrite(struct file*, char*, int n); // fs.c void int struct inode* struct inode* struct inode* void void void void void void int struct inode* struct inode* int readsb(int dev, struct superblock *sb); dirlink(struct inode*, char*, uint); dirlookup(struct inode*, char*, uint*); ialloc(uint, short); idup(struct inode*); iinit(void); ilock(struct inode*); iput(struct inode*); iunlock(struct inode*); iunlockput(struct inode*); iupdate(struct inode*); namecmp(const char*, const char*); namei(char*); nameiparent(char*, char*); readi(struct inode*, char*, uint, uint); Sheet 02 Sep 5 23:39 2011 xv6/defs.h Page 2 Sep 5 23:39 2011 xv6/defs.h Page 3 0300 0301 0302 0303 0304 0305 0306 0307 0308 0309 0310 0311 0312 0313 0314 0315 0316 0317 0318 0319 0320 0321 0322 0323 0324 0325 0326 0327 0328 0329 0330 0331 0332 0333 0334 0335 0336 0337 0338 0339 0340 0341 0342 0343 0344 0345 0346 0347 0348 0349 0350 0351 0352 0353 0354 0355 0356 0357 0358 0359 0360 0361 0362 0363 0364 0365 0366 0367 0368 0369 0370 0371 0372 0373 0374 0375 0376 0377 0378 0379 0380 0381 0382 0383 0384 0385 0386 0387 0388 0389 0390 0391 0392 0393 0394 0395 0396 0397 0398 0399 void int stati(struct inode*, struct stat*); writei(struct inode*, char*, uint, uint); // ide.c void void void ideinit(void); ideintr(void); iderw(struct buf*); // ioapic.c void extern uchar void ioapicenable(int irq, int cpu); ioapicid; ioapicinit(void); // kalloc.c char* char* void void uint enter_alloc(void); kalloc(void); kfree(char*); kinit(void); detect_memory(void); // kbd.c void kbdintr(void); // lapic.c int extern volatile void void void void cpunum(void); uint* lapic; lapiceoi(void); lapicinit(int); lapicstartap(uchar, uint); microdelay(int); // log.c void void void void initlog(void); log_write(struct buf*); begin_trans(); commit_trans(); // mp.c extern int int void void ismp; mpbcpu(void); mpinit(void); mpstartthem(void); // picirq.c void void picenable(int); picinit(void); Sheet 03 // pipe.c int void int int pipealloc(struct file**, struct file**); pipeclose(struct pipe*, int); piperead(struct pipe*, char*, int); pipewrite(struct pipe*, char*, int); // proc.c struct proc* void int int int void void void void void void int void void copyproc(struct proc*); exit(void); fork(void); growproc(int); kill(int); pinit(void); procdump(void); scheduler(void) __attribute__((noreturn)); sched(void); sleep(void*, struct spinlock*); userinit(void); wait(void); wakeup(void*); yield(void); // swtch.S void swtch(struct context**, struct context*); // spinlock.c void void int void void void void acquire(struct spinlock*); getcallerpcs(void*, uint*); holding(struct spinlock*); initlock(struct spinlock*, char*); release(struct spinlock*); pushcli(void); popcli(void); // string.c int void* void* char* int int char* memcmp(const void*, const void*, uint); memmove(void*, const void*, uint); memset(void*, int, uint); safestrcpy(char*, const char*, int); strlen(const char*); strncmp(const char*, const char*, uint); strncpy(char*, const char*, int); // syscall.c int int int int int argint(int, int*); argptr(int, char**, int); argstr(int, char**); fetchint(struct proc*, uint, int*); fetchstr(struct proc*, uint, char**); Sheet 03 Sep 5 23:39 2011 xv6/defs.h Page 4 Sep 5 23:39 2011 xv6/x86.h Page 1 0400 0401 0402 0403 0404 0405 0406 0407 0408 0409 0410 0411 0412 0413 0414 0415 0416 0417 0418 0419 0420 0421 0422 0423 0424 0425 0426 0427 0428 0429 0430 0431 0432 0433 0434 0435 0436 0437 0438 0439 0440 0441 0442 0443 0444 0445 0446 0447 0448 0449 0450 0451 0452 0453 0454 0455 0456 0457 0458 0459 0460 0461 0462 0463 0464 0465 0466 0467 0468 0469 0470 0471 0472 0473 0474 0475 0476 0477 0478 0479 0480 0481 0482 0483 0484 0485 0486 0487 0488 0489 0490 0491 0492 0493 0494 0495 0496 0497 0498 0499 void syscall(void); // timer.c void timerinit(void); // trap.c void idtinit(void); extern uint ticks; void tvinit(void); extern struct spinlock tickslock; // uart.c void void void uartinit(void); uartintr(void); uartputc(int); // vm.c void void void pde_t* char* int int void void int pde_t* void void int void seginit(void); kvmalloc(void); vmenable(void); setupkvm(char* (*alloc)()); uva2ka(pde_t*, char*); allocuvm(pde_t*, uint, uint); deallocuvm(pde_t*, uint, uint); freevm(pde_t*); inituvm(pde_t*, char*, uint); loaduvm(pde_t*, char*, struct inode*, uint, uint); copyuvm(pde_t*, uint); switchuvm(struct proc*); switchkvm(void); copyout(pde_t*, uint, void*, uint); clearpteu(pde_t *pgdir, char *uva); // number of elements in fixed−size array #define NELEM(x) (sizeof(x)/sizeof((x)[0])) Sheet 04 // Routines to let C code use special x86 instructions. static inline uchar inb(ushort port) { uchar data; asm volatile("in %1,%0" : "=a" (data) : "d" (port)); return data; } static inline void insl(int port, void *addr, int cnt) { asm volatile("cld; rep insl" : "=D" (addr), "=c" (cnt) : "d" (port), "0" (addr), "1" (cnt) : "memory", "cc"); } static inline void outb(ushort port, uchar data) { asm volatile("out %0,%1" : : "a" (data), "d" (port)); } static inline void outw(ushort port, ushort data) { asm volatile("out %0,%1" : : "a" (data), "d" (port)); } static inline void outsl(int port, const void *addr, int cnt) { asm volatile("cld; rep outsl" : "=S" (addr), "=c" (cnt) : "d" (port), "0" (addr), "1" (cnt) : "cc"); } static inline void stosb(void *addr, int data, int cnt) { asm volatile("cld; rep stosb" : "=D" (addr), "=c" (cnt) : "0" (addr), "1" (cnt), "a" (data) : "memory", "cc"); } Sheet 04 Sep 5 23:39 2011 xv6/x86.h Page 2 Sep 5 23:39 2011 xv6/x86.h Page 3 0500 0501 0502 0503 0504 0505 0506 0507 0508 0509 0510 0511 0512 0513 0514 0515 0516 0517 0518 0519 0520 0521 0522 0523 0524 0525 0526 0527 0528 0529 0530 0531 0532 0533 0534 0535 0536 0537 0538 0539 0540 0541 0542 0543 0544 0545 0546 0547 0548 0549 0550 0551 0552 0553 0554 0555 0556 0557 0558 0559 0560 0561 0562 0563 0564 0565 0566 0567 0568 0569 0570 0571 0572 0573 0574 0575 0576 0577 0578 0579 0580 0581 0582 0583 0584 0585 0586 0587 0588 0589 0590 0591 0592 0593 0594 0595 0596 0597 0598 0599 static inline void stosl(void *addr, int data, int cnt) { asm volatile("cld; rep stosl" : "=D" (addr), "=c" (cnt) : "0" (addr), "1" (cnt), "a" (data) : "memory", "cc"); } struct segdesc; static inline void lgdt(struct segdesc *p, int size) { volatile ushort pd[3]; pd[0] = size−1; pd[1] = (uint)p; pd[2] = (uint)p >> 16; asm volatile("lgdt (%0)" : : "r" (pd)); } struct gatedesc; static inline void lidt(struct gatedesc *p, int size) { volatile ushort pd[3]; pd[0] = size−1; pd[1] = (uint)p; pd[2] = (uint)p >> 16; asm volatile("lidt (%0)" : : "r" (pd)); } static inline void ltr(ushort sel) { asm volatile("ltr %0" : : "r" (sel)); } static inline uint readeflags(void) { uint eflags; asm volatile("pushfl; popl %0" : "=r" (eflags)); return eflags; } Sheet 05 static inline void loadgs(ushort v) { asm volatile("movw %0, %%gs" : : "r" (v)); } static inline void cli(void) { asm volatile("cli"); } static inline void sti(void) { asm volatile("sti"); } static inline uint xchg(volatile uint *addr, uint newval) { uint result; // The + in "+m" denotes a read−modify−write operand. asm volatile("lock; xchgl %0, %1" : "+m" (*addr), "=a" (result) : "1" (newval) : "cc"); return result; } static inline uint rcr2(void) { uint val; asm volatile("movl %%cr2,%0" : "=r" (val)); return val; } static inline void lcr3(uint val) { asm volatile("movl %0,%%cr3" : : "r" (val)); } Sheet 05 Sep 5 23:39 2011 xv6/x86.h Page 4 Sep 5 23:39 2011 xv6/asm.h Page 1 0600 0601 0602 0603 0604 0605 0606 0607 0608 0609 0610 0611 0612 0613 0614 0615 0616 0617 0618 0619 0620 0621 0622 0623 0624 0625 0626 0627 0628 0629 0630 0631 0632 0633 0634 0635 0636 0637 0638 0639 0640 0641 0642 0643 0644 0645 0646 0647 0648 0649 0650 0651 0652 0653 0654 0655 0656 0657 0658 0659 0660 0661 0662 0663 0664 0665 0666 0667 0668 0669 0670 0671 0672 0673 0674 0675 0676 0677 0678 0679 0680 0681 0682 0683 0684 0685 0686 0687 0688 0689 0690 0691 0692 0693 0694 0695 0696 0697 0698 0699 // Layout of the trap frame built on the stack by the // hardware and by trapasm.S, and passed to trap(). struct trapframe { // registers as pushed by pusha uint edi; uint esi; uint ebp; uint oesp; // useless & ignored uint ebx; uint edx; uint ecx; uint eax; // rest of trap frame ushort gs; ushort padding1; ushort fs; ushort padding2; ushort es; ushort padding3; ushort ds; ushort padding4; uint trapno; // below here defined by x86 hardware uint err; uint eip; ushort cs; ushort padding5; uint eflags; // below here only when crossing rings, such as from user to kernel uint esp; ushort ss; ushort padding6; }; Sheet 06 // // assembler macros to create x86 segments // #define SEG_NULLASM .word 0, 0; .byte 0, 0, 0, 0 \ \ // The 0xC0 means the limit is in 4096−byte units // and (for executable segments) 32−bit mode. #define SEG_ASM(type,base,lim) .word (((lim) >> 12) & 0xffff), ((base) & 0xffff); .byte (((base) >> 16) & 0xff), (0x90 | (type)), (0xC0 | (((lim) >> 28) & 0xf)), (((base) >> 24) #define #define #define #define #define #define Sheet 06 STA_X STA_E STA_C STA_W STA_R STA_A 0x8 0x4 0x4 0x2 0x2 0x1 // // // // // // \ \ \ & 0xff) Executable segment Expand down (non−executable segments) Conforming code segment (executable only) Writeable (non−executable segments) Readable (executable segments) Accessed Sep 5 23:39 2011 xv6/mmu.h Page 1 Sep 5 23:39 2011 xv6/mmu.h Page 2 0700 0701 0702 0703 0704 0705 0706 0707 0708 0709 0710 0711 0712 0713 0714 0715 0716 0717 0718 0719 0720 0721 0722 0723 0724 0725 0726 0727 0728 0729 0730 0731 0732 0733 0734 0735 0736 0737 0738 0739 0740 0741 0742 0743 0744 0745 0746 0747 0748 0749 0750 0751 0752 0753 0754 0755 0756 0757 0758 0759 0760 0761 0762 0763 0764 0765 0766 0767 0768 0769 0770 0771 0772 0773 0774 0775 0776 0777 0778 0779 0780 0781 0782 0783 0784 0785 0786 0787 0788 0789 0790 0791 0792 0793 0794 0795 0796 0797 0798 0799 // This file contains definitions for the // x86 memory management unit (MMU). // Eflags register #define FL_CF #define FL_PF #define FL_AF #define FL_ZF #define FL_SF #define FL_TF #define FL_IF #define FL_DF #define FL_OF #define FL_IOPL_MASK #define FL_IOPL_0 #define FL_IOPL_1 #define FL_IOPL_2 #define FL_IOPL_3 #define FL_NT #define FL_RF #define FL_VM #define FL_AC #define FL_VIF #define FL_VIP #define FL_ID 0x00000001 0x00000004 0x00000010 0x00000040 0x00000080 0x00000100 0x00000200 0x00000400 0x00000800 0x00003000 0x00000000 0x00001000 0x00002000 0x00003000 0x00004000 0x00010000 0x00020000 0x00040000 0x00080000 0x00100000 0x00200000 // // // // // // // // // // // // // // // // // // // // // Carry Flag Parity Flag Auxiliary carry Flag Zero Flag Sign Flag Trap Flag Interrupt Enable Direction Flag Overflow Flag I/O Privilege Level bitmask IOPL == 0 IOPL == 1 IOPL == 2 IOPL == 3 Nested Task Resume Flag Virtual 8086 mode Alignment Check Virtual Interrupt Flag Virtual Interrupt Pending ID flag // Control Register flags #define CR0_PE 0x00000001 #define CR0_MP 0x00000002 #define CR0_EM 0x00000004 #define CR0_TS 0x00000008 #define CR0_ET 0x00000010 #define CR0_NE 0x00000020 #define CR0_WP 0x00010000 #define CR0_AM 0x00040000 #define CR0_NW 0x20000000 #define CR0_CD 0x40000000 #define CR0_PG 0x80000000 // // // // // // // // // // // Protection Enable Monitor coProcessor Emulation Task Switched Extension Type Numeric Errror Write Protect Alignment Mask Not Writethrough Cache Disable Paging #define CR4_PSE // Page size extension #define #define #define #define #define #define Sheet 07 SEG_KCODE SEG_KDATA SEG_KCPU SEG_UCODE SEG_UDATA SEG_TSS 0x00000010 1 2 3 4 5 6 // // // // // // kernel code kernel data+stack kernel per−cpu data user code user data+stack this process’s task state #ifndef __ASSEMBLER__ // Segment Descriptor struct segdesc { uint lim_15_0 : 16; uint base_15_0 : 16; uint base_23_16 : 8; uint type : 4; uint s : 1; uint dpl : 2; uint p : 1; uint lim_19_16 : 4; uint avl : 1; uint rsv1 : 1; uint db : 1; uint g : 1; uint base_31_24 : 8; }; // // // // // // // // // // // // // Low bits of segment limit Low bits of segment base address Middle bits of segment base address Segment type (see STS_ constants) 0 = system, 1 = application Descriptor Privilege Level Present High bits of segment limit Unused (available for software use) Reserved 0 = 16−bit segment, 1 = 32−bit segment Granularity: limit scaled by 4K when set High bits of segment base address // Normal segment #define SEG(type, base, lim, dpl) (struct segdesc) { ((lim) >> 12) & 0xffff, (uint)(base) & 0xffff, ((uint)(base) >> 16) & 0xff, type, 1, dpl, 1, (uint)(lim) >> 28, 0, 0, 1, 1, (uint)(base) >> 24 } #define SEG16(type, base, lim, dpl) (struct segdesc) { (lim) & 0xffff, (uint)(base) & 0xffff, ((uint)(base) >> 16) & 0xff, type, 1, dpl, 1, (uint)(lim) >> 16, 0, 0, 1, 0, (uint)(base) >> 24 } #endif #define DPL_USER 0x3 \ \ \ \ \ \ // User DPL // Application segment type #define STA_X 0x8 #define STA_E 0x4 #define STA_C 0x4 #define STA_W 0x2 #define STA_R 0x2 #define STA_A 0x1 bits // Executable segment // Expand down (non−executable segments) // Conforming code segment (executable only) // Writeable (non−executable segments) // Readable (executable segments) // Accessed // System segment type bits #define STS_T16A 0x1 #define STS_LDT 0x2 #define STS_T16B 0x3 #define STS_CG16 0x4 #define STS_TG 0x5 #define STS_IG16 0x6 #define STS_TG16 0x7 #define STS_T32A 0x9 #define STS_T32B 0xB #define STS_CG32 0xC // // // // // // // // // // Sheet 07 Available 16−bit Local Descriptor Busy 16−bit TSS 16−bit Call Gate Task Gate / Coum 16−bit Interrupt 16−bit Trap Gate Available 32−bit Busy 32−bit TSS 32−bit Call Gate TSS Table Transmitions Gate TSS Sep 5 23:39 2011 xv6/mmu.h Page 3 Sep 5 23:39 2011 xv6/mmu.h Page 4 0800 0801 0802 0803 0804 0805 0806 0807 0808 0809 0810 0811 0812 0813 0814 0815 0816 0817 0818 0819 0820 0821 0822 0823 0824 0825 0826 0827 0828 0829 0830 0831 0832 0833 0834 0835 0836 0837 0838 0839 0840 0841 0842 0843 0844 0845 0846 0847 0848 0849 0850 // Task state segment format 0851 struct taskstate { 0852 uint link; // Old ts selector 0853 uint esp0; // Stack pointers and segment selectors 0854 ushort ss0; // after an increase in privilege level 0855 ushort padding1; 0856 uint *esp1; 0857 ushort ss1; 0858 ushort padding2; 0859 uint *esp2; 0860 ushort ss2; 0861 ushort padding3; 0862 void *cr3; // Page directory base 0863 uint *eip; // Saved state from last task switch 0864 uint eflags; 0865 uint eax; // More saved state (registers) 0866 uint ecx; 0867 uint edx; 0868 uint ebx; 0869 uint *esp; 0870 uint *ebp; 0871 uint esi; 0872 uint edi; 0873 ushort es; // Even more saved state (segment selectors) 0874 ushort padding4; 0875 ushort cs; 0876 ushort padding5; 0877 ushort ss; 0878 ushort padding6; 0879 ushort ds; 0880 ushort padding7; 0881 ushort fs; 0882 ushort padding8; 0883 ushort gs; 0884 ushort padding9; 0885 ushort ldt; 0886 ushort padding10; 0887 ushort t; // Trap on task switch 0888 ushort iomb; // I/O map base address 0889 }; 0890 0891 0892 0893 0894 0895 0896 0897 0898 0899 #define STS_IG32 #define STS_TG32 // // // // // // // 0xE 0xF // 32−bit Interrupt Gate // 32−bit Trap Gate A virtual address ’la’ has a three−part structure as follows: +−−−−−−−−10−−−−−−+−−−−−−−10−−−−−−−+−−−−−−−−−12−−−−−−−−−−+ | Page Directory | Page Table | Offset within Page | | Index | Index | | +−−−−−−−−−−−−−−−−+−−−−−−−−−−−−−−−−+−−−−−−−−−−−−−−−−−−−−−+ \−−− PDX(va) −−/ \−−− PTX(va) −−/ // page directory index #define PDX(va) (((uint)(va) >> PDXSHIFT) & 0x3FF) // page table index #define PTX(va) (((uint)(va) >> PTXSHIFT) & 0x3FF) // construct virtual address from indexes and offset #define PGADDR(d, t, o) ((uint)((d) << PDXSHIFT | (t) << PTXSHIFT | (o))) // Page #define #define #define directory and page table constants. NPDENTRIES 1024 // # directory entries per page directory NPTENTRIES 1024 // # PTEs per page table PGSIZE 4096 // bytes mapped by a page #define PGSHIFT #define PTXSHIFT #define PDXSHIFT 12 12 22 // log2(PGSIZE) // offset of PTX in a linear address // offset of PDX in a linear address #define PGROUNDUP(sz) (((sz)+PGSIZE−1) & ~(PGSIZE−1)) #define PGROUNDDOWN(a) (((a)) & ~(PGSIZE−1)) // Page #define #define #define #define #define #define #define #define #define table/directory PTE_P PTE_W PTE_U PTE_PWT PTE_PCD PTE_A PTE_D PTE_PS PTE_MBZ entry flags. 0x001 // Present 0x002 // Writeable 0x004 // User 0x008 // Write−Through 0x010 // Cache−Disable 0x020 // Accessed 0x040 // Dirty 0x080 // Page Size 0x180 // Bits must be zero // Address in page table or page directory entry #define PTE_ADDR(pte) ((uint)(pte) & ~0xFFF) #ifndef __ASSEMBLER__ typedef uint pte_t; Sheet 08 Sheet 08 Sep 5 23:39 2011 xv6/mmu.h Page 5 Sep 5 23:39 2011 xv6/elf.h Page 1 0900 0901 0902 0903 0904 0905 0906 0907 0908 0909 0910 0911 0912 0913 0914 0915 0916 0917 0918 0919 0920 0921 0922 0923 0924 0925 0926 0927 0928 0929 0930 0931 0932 0933 0934 0935 0936 0937 0938 0939 0940 0941 0942 0943 0944 0945 0946 0947 0948 0949 0950 0951 0952 0953 0954 0955 0956 0957 0958 0959 0960 0961 0962 0963 0964 0965 0966 0967 0968 0969 0970 0971 0972 0973 0974 0975 0976 0977 0978 0979 0980 0981 0982 0983 0984 0985 0986 0987 0988 0989 0990 0991 0992 0993 0994 0995 0996 0997 0998 0999 // Gate descriptors for struct gatedesc { uint off_15_0 : 16; uint cs : 16; uint args : 5; uint rsv1 : 3; uint type : 4; uint s : 1; uint dpl : 2; uint p : 1; uint off_31_16 : 16; }; interrupts and traps // // // // // // // // // low 16 bits of offset in segment code segment selector # args, 0 for interrupt/trap gates reserved(should be zero I guess) type(STS_{TG,IG32,TG32}) must be 0 (system) descriptor(meaning new) privilege level Present high bits of offset in segment // Set up a normal interrupt/trap gate descriptor. // − istrap: 1 for a trap (= exception) gate, 0 for an interrupt gate. // interrupt gate clears FL_IF, trap gate leaves FL_IF alone // − sel: Code segment selector for interrupt/trap handler // − off: Offset in code segment for interrupt/trap handler // − dpl: Descriptor Privilege Level − // the privilege level required for software to invoke // this interrupt/trap gate explicitly using an int instruction. #define SETGATE(gate, istrap, sel, off, d) \ { \ (gate).off_15_0 = (uint)(off) & 0xffff; \ (gate).cs = (sel); \ (gate).args = 0; \ (gate).rsv1 = 0; \ (gate).type = (istrap) ? STS_TG32 : STS_IG32; \ (gate).s = 0; \ (gate).dpl = (d); \ (gate).p = 1; \ (gate).off_31_16 = (uint)(off) >> 16; \ } #endif Sheet 09 // Format of an ELF executable file #define ELF_MAGIC 0x464C457FU // "\x7FELF" in little endian // File header struct elfhdr { uint magic; // must equal ELF_MAGIC uchar elf[12]; ushort type; ushort machine; uint version; uint entry; uint phoff; uint shoff; uint flags; ushort ehsize; ushort phentsize; ushort phnum; ushort shentsize; ushort shnum; ushort shstrndx; }; // Program section header struct proghdr { uint type; uint off; uint vaddr; uint paddr; uint filesz; uint memsz; uint flags; uint align; }; // Values for Proghdr type #define ELF_PROG_LOAD // Flag #define #define #define Sheet 09 1 bits for Proghdr flags ELF_PROG_FLAG_EXEC 1 ELF_PROG_FLAG_WRITE 2 ELF_PROG_FLAG_READ 4 Sep 5 23:39 2011 xv6/entry.S Page 1 Sep 5 23:39 2011 xv6/entry.S Page 2 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 orl $(CR0_PG|CR0_WP), %eax 1051 movl %eax, %cr0 1052 1053 # Set up the stack pointer. 1054 movl $(stack + KSTACKSIZE), %esp 1055 1056 # Jump to main(), and switch to executing at 1057 # high addresses. The indirect call is needed because 1058 # the assembler produces a PC−relative instruction 1059 # for a direct jump. 1060 mov $main, %eax 1061 jmp *%eax 1062 1063 .comm stack, KSTACKSIZE 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 # # # # # # # # # # # # # # # Multiboot header, for multiboot boot loaders like GNU Grub. http://www.gnu.org/software/grub/manual/multiboot/multiboot.html Using GRUB 2, you can boot xv6 from a file stored in a Linux file system by copying kernel or kernelmemfs to /boot and then adding this menu entry: menuentry "xv6" { insmod ext2 set root=’(hd0,msdos1)’ set kernel=’/boot/kernel’ echo "Loading ${kernel}..." multiboot ${kernel} ${kernel} boot } #include #include #include #include "asm.h" "memlayout.h" "mmu.h" "param.h" # Multiboot header. Data to direct multiboot loader. .p2align 2 .text .globl multiboot_header multiboot_header: #define magic 0x1badb002 #define flags 0 .long magic .long flags .long (−magic−flags) # By convention, the _start symbol specifies the ELF entry point. # Since we haven’t set up virtual memory yet, our entry point is # the physical address of ’entry’. .globl _start _start = V2P_WO(entry) # Entering xv6 on boot processor. Machine is mostly set up. .globl entry entry: # Turn on page size extension for 4Mbyte pages movl %cr4, %eax orl $(CR4_PSE), %eax movl %eax, %cr4 # Set page directory movl $(V2P_WO(entrypgdir)), %eax movl %eax, %cr3 # Turn on paging. movl %cr0, %eax Sheet 10 Sheet 10 Sep 5 23:39 2011 xv6/entryother.S Page 1 Sep 5 23:39 2011 xv6/entryother.S Page 2 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 #include "asm.h" #include "memlayout.h" #include "mmu.h" # # # # # # # # # # # # # # # # # # Each non−boot CPU ("AP") is started up in response to a STARTUP IPI from the boot CPU. Section B.4.2 of the Multi−Processor Specification says that the AP will start in real mode with CS:IP set to XY00:0000, where XY is an 8−bit value sent with the STARTUP. Thus this code must start at a 4096−byte boundary. Because this code sets DS to zero, it must sit at an address in the low 2^16 bytes. Startothers (in main.c) sends the STARTUPs one at a time. It copies this code (start) at 0x7000. It puts the address of a newly allocated per−core stack in start−4,the address of the place to jump to (mpenter) in start−8, and the physical address of entrypgdir in start−12. This code is identical to bootasm.S except: − it does not need to enable A20 − it uses the address at start−4, start−8, and start−12 .code16 .globl start start: cli xorw movw movw movw %ax,%ax %ax,%ds %ax,%es %ax,%ss lgdt movl orl movl gdtdesc %cr0, %eax $CR0_PE, %eax %eax, %cr0 Sheet 11 ljmpl .code32 start32: movw movw movw movw movw movw movw $(SEG_KCODE<<3), $(start32) $(SEG_KDATA<<3), %ax %ax, %ds %ax, %es %ax, %ss $0, %ax %ax, %fs %ax, %gs # Turn on page size extension for 4Mbyte pages movl %cr4, %eax orl $(CR4_PSE), %eax movl %eax, %cr4 # Use enterpgdir as our initial page table movl (start−12), %eax movl %eax, %cr3 # Turn on paging. movl %cr0, %eax orl $(CR0_PE|CR0_PG|CR0_WP), %eax movl %eax, %cr0 # Switch to the stack allocated by startothers() movl (start−4), %esp # Call mpenter() call *(start−8) movw movw outw movw outw spin: jmp $0x8a00, %ax %ax, %dx %ax, %dx $0x8ae0, %ax %ax, %dx spin .p2align 2 gdt: SEG_NULLASM SEG_ASM(STA_X|STA_R, 0, 0xffffffff) SEG_ASM(STA_W, 0, 0xffffffff) gdtdesc: .word (gdtdesc − gdt − 1) .long gdt Sheet 11 Sep 5 23:39 2011 xv6/main.c Page 1 Sep 5 23:39 2011 xv6/main.c Page 2 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "memlayout.h" "mmu.h" "proc.h" "x86.h" static void startothers(void); static void mpmain(void) __attribute__((noreturn)); extern pde_t *kpgdir; // Bootstrap processor starts running C code here. // Allocate a real stack and switch to it, first // doing some setup required for memory allocator to work. int main(void) { kvmalloc(); // kernel page table mpinit(); // collect info about this machine lapicinit(mpbcpu()); seginit(); // set up segments cprintf("\ncpu%d: starting xv6\n\n", cpu−>id); picinit(); // interrupt controller ioapicinit(); // another interrupt controller consoleinit(); // I/O devices & their interrupts uartinit(); // serial port pinit(); // process table tvinit(); // trap vectors binit(); // buffer cache fileinit(); // file table iinit(); // inode cache ideinit(); // disk if(!ismp) timerinit(); // uniprocessor timer startothers(); // start other processors (must come before kinit) kinit(); // initialize memory allocator userinit(); // first user process (must come after kinit) // Finish setting up this processor in mpmain. mpmain(); } Sheet 12 // Other CPUs jump here from entryother.S. static void mpenter(void) { switchkvm(); seginit(); lapicinit(cpunum()); mpmain(); } // Common CPU setup code. static void mpmain(void) { cprintf("cpu%d: starting\n", cpu−>id); idtinit(); // load idt register xchg(&cpu−>started, 1); // tell startothers() we’re up scheduler(); // start running processes } pde_t entrypgdir[]; // For entry.S // Start the non−boot (AP) processors. static void startothers(void) { extern uchar _binary_entryother_start[], _binary_entryother_size[]; uchar *code; struct cpu *c; char *stack; // Write entry code to unused memory at 0x7000. // The linker has placed the image of entryother.S in // _binary_entryother_start. code = p2v(0x7000); memmove(code, _binary_entryother_start, (uint)_binary_entryother_size); for(c = cpus; c < cpus+ncpu; c++){ if(c == cpus+cpunum()) // We’ve started already. continue; Sheet 12 // Tell entryother.S what stack to use, where to enter, and what // pgdir to use. We cannot use kpgdir yet, because the AP processor // is running in low memory, so we use entrypgdir for the APs too. // kalloc can return addresses above 4Mbyte (the machine may have // much more physical memory than 4Mbyte), which aren’t mapped by // entrypgdir, so we must allocate a stack using enter_alloc(); // this introduces the constraint that xv6 cannot use kalloc until // after these last enter_alloc invocations. stack = enter_alloc(); Sep 5 23:39 2011 xv6/main.c Page 3 Sep 5 23:39 2011 xv6/main.c Page 4 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 // Blank page. 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 *(void**)(code−4) = stack + KSTACKSIZE; *(void**)(code−8) = mpenter; *(int**)(code−12) = (void *) v2p(entrypgdir); lapicstartap(c−>id, v2p(code)); // wait for cpu to finish mpmain() while(c−>started == 0) ; } } // Boot page table used in entry.S and entryother.S. // Page directories (and page tables), must start on a page boundary, // hence the "__aligned__" attribute. // Use PTE_PS in page directory entry to enable 4Mbyte pages. __attribute__((__aligned__(PGSIZE))) pde_t entrypgdir[NPDENTRIES] = { // Map VA’s [0, 4MB) to PA’s [0, 4MB) [0] = (0) + PTE_P + PTE_W + PTE_PS, // Map VA’s [KERNBASE, KERNBASE+4MB) to PA’s [0, 4MB) [KERNBASE>>PDXSHIFT] = (0) + PTE_P + PTE_W + PTE_PS, }; Sheet 13 Sheet 13 Sep 5 23:39 2011 xv6/spinlock.h Page 1 Sep 5 23:39 2011 xv6/spinlock.c Page 1 1400 // Mutual exclusion 1401 struct spinlock { 1402 uint locked; 1403 1404 // For debugging: 1405 char *name; 1406 struct cpu *cpu; 1407 uint pcs[10]; 1408 1409 }; 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 Sheet 14 lock. // Is the lock held? // // // // Name of lock. The cpu holding the lock. The call stack (an array of program counters) that locked the lock. // Mutual exclusion spin locks. #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "x86.h" "memlayout.h" "mmu.h" "proc.h" "spinlock.h" void initlock(struct spinlock *lk, char *name) { lk−>name = name; lk−>locked = 0; lk−>cpu = 0; } // Acquire the lock. // Loops (spins) until the lock is acquired. // Holding a lock for a long time may cause // other CPUs to waste time spinning to acquire it. void acquire(struct spinlock *lk) { pushcli(); // disable interrupts to avoid deadlock. if(holding(lk)) panic("acquire"); // The xchg is atomic. // It also serializes, so that reads after acquire are not // reordered before it. while(xchg(&lk−>locked, 1) != 0) ; // Record info about lock acquisition for debugging. lk−>cpu = cpu; getcallerpcs(&lk, lk−>pcs); } Sheet 14 Sep 5 23:39 2011 xv6/spinlock.c Page 2 Sep 5 23:39 2011 xv6/spinlock.c Page 3 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 // Release the lock. void release(struct spinlock *lk) { if(!holding(lk)) panic("release"); lk−>pcs[0] = 0; lk−>cpu = 0; // The xchg serializes, so that reads before release are // not reordered after it. The 1996 PentiumPro manual (Volume 3, // 7.2) says reads can be carried out speculatively and in // any order, which implies we need to serialize here. // But the 2007 Intel 64 Architecture Memory Ordering White // Paper says that Intel 64 and IA−32 will not move a load // after a store. So lock−>locked = 0 would work here. // The xchg being asm volatile ensures gcc emits it after // the above assignments (and after the critical section). xchg(&lk−>locked, 0); popcli(); } // Record the current call stack in pcs[] by following the %ebp chain. void getcallerpcs(void *v, uint pcs[]) { uint *ebp; int i; ebp = (uint*)v − 2; for(i = 0; i < 10; i++){ if(ebp == 0 || ebp < (uint*)KERNBASE || ebp == (uint*)0xffffffff) break; pcs[i] = ebp[1]; // saved %eip ebp = (uint*)ebp[0]; // saved %ebp } for(; i < 10; i++) pcs[i] = 0; } // Check whether this cpu is holding the lock. int holding(struct spinlock *lock) { return lock−>locked && lock−>cpu == cpu; } Sheet 15 // Pushcli/popcli are like cli/sti except that they are matched: // it takes two popcli to undo two pushcli. Also, if interrupts // are off, then pushcli, popcli leaves them off. void pushcli(void) { int eflags; eflags = readeflags(); cli(); if(cpu−>ncli++ == 0) cpu−>intena = eflags & FL_IF; } void popcli(void) { if(readeflags()&FL_IF) panic("popcli − interruptible"); if(−−cpu−>ncli < 0) panic("popcli"); if(cpu−>ncli == 0 && cpu−>intena) sti(); } Sheet 15 Sep 5 23:39 2011 xv6/vm.c Page 1 Sep 5 23:39 2011 xv6/vm.c Page 2 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680 1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 #include #include #include #include #include #include #include #include "param.h" "types.h" "defs.h" "x86.h" "memlayout.h" "mmu.h" "proc.h" "elf.h" extern char data[]; // defined by kernel.ld pde_t *kpgdir; // for use in scheduler() struct segdesc gdt[NSEGS]; // Set up CPU’s kernel segment descriptors. // Run once on entry on each CPU. void seginit(void) { struct cpu *c; // Map "logical" addresses to virtual addresses using identity map. // Cannot share a CODE descriptor for both kernel and user // because it would have to have DPL_USR, but the CPU forbids // an interrupt from CPL=0 to DPL=3. c = &cpus[cpunum()]; c−>gdt[SEG_KCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, 0); c−>gdt[SEG_KDATA] = SEG(STA_W, 0, 0xffffffff, 0); c−>gdt[SEG_UCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, DPL_USER); c−>gdt[SEG_UDATA] = SEG(STA_W, 0, 0xffffffff, DPL_USER); // Map cpu, and curproc c−>gdt[SEG_KCPU] = SEG(STA_W, &c−>cpu, 8, 0); lgdt(c−>gdt, sizeof(c−>gdt)); loadgs(SEG_KCPU << 3); // Initialize cpu−local storage. cpu = c; proc = 0; } Sheet 16 // Return the address of the PTE in page table pgdir // that corresponds to virtual address va. If alloc!=0, // create any required page table pages. static pte_t * walkpgdir(pde_t *pgdir, const void *va, char* (*alloc)(void)) { pde_t *pde; pte_t *pgtab; pde = &pgdir[PDX(va)]; if(*pde & PTE_P){ pgtab = (pte_t*)p2v(PTE_ADDR(*pde)); } else { if(!alloc || (pgtab = (pte_t*)alloc()) == 0) return 0; // Make sure all those PTE_P bits are zero. memset(pgtab, 0, PGSIZE); // The permissions here are overly generous, but they can // be further restricted by the permissions in the page table // entries, if necessary. *pde = v2p(pgtab) | PTE_P | PTE_W | PTE_U; } return &pgtab[PTX(va)]; } // Create PTEs for virtual addresses starting at va that refer to // physical addresses starting at pa. va and size might not // be page−aligned. static int mappages(pde_t *pgdir, void *va, uint size, uint pa, int perm, char* (*alloc)(void)) { char *a, *last; pte_t *pte; a = (char*)PGROUNDDOWN((uint)va); last = (char*)PGROUNDDOWN(((uint)va) + size − 1); for(;;){ if((pte = walkpgdir(pgdir, a, alloc)) == 0) return −1; if(*pte & PTE_P) panic("remap"); *pte = pa | perm | PTE_P; if(a == last) break; a += PGSIZE; pa += PGSIZE; } return 0; } Sheet 16 Sep 5 23:39 2011 xv6/vm.c Page 3 Sep 5 23:39 2011 xv6/vm.c Page 4 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740 1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 // The mappings from logical to virtual are one to one (i.e., // segmentation doesn’t do anything). There is one page table per // process, plus one that’s used when a CPU is not running any process // (kpgdir). A user process uses the same page table as the kernel; the // page protection bits prevent it from accessing kernel memory. // // setupkvm() and exec() set up every page table like this: // 0..KERNBASE: user memory (text+data+stack+heap), mapped to some free // phys memory // KERNBASE..KERNBASE+EXTMEM: mapped to 0..EXTMEM (for I/O space) // KERNBASE+EXTMEM..KERNBASE+end: mapped to EXTMEM..end kernel, // w. no write permission // KERNBASE+end..KERBASE+PHYSTOP: mapped to end..PHYSTOP, // rw data + free memory // 0xfe000000..0: mapped direct (devices such as ioapic) // // The kernel allocates memory for its heap and for user memory // between KERNBASE+end and the end of physical memory (PHYSTOP). // The user program sits in the bottom of the address space, and the // kernel at the top at KERNBASE. static struct kmap { void *virt; uint phys_start; uint phys_end; int perm; } kmap[] = { { P2V(0), 0, 1024*1024, PTE_W}, // I/O space { (void*)KERNLINK, V2P(KERNLINK), V2P(data), 0}, // kernel text+rodata { data, V2P(data), PHYSTOP, PTE_W}, // kernel data, memory { (void*)DEVSPACE, DEVSPACE, 0, PTE_W}, // more devices }; // Set up kernel part of a page table. pde_t* setupkvm(char* (*alloc)(void)) { pde_t *pgdir; struct kmap *k; if((pgdir = (pde_t*)alloc()) == 0) return 0; memset(pgdir, 0, PGSIZE); if (p2v(PHYSTOP) > (void*)DEVSPACE) panic("PHYSTOP too high"); for(k = kmap; k < &kmap[NELEM(kmap)]; k++) if(mappages(pgdir, k−>virt, k−>phys_end − k−>phys_start, (uint)k−>phys_start, k−>perm, alloc) < 0) return 0; return pgdir; } Sheet 17 // Allocate one page table for the machine for the kernel address // space for scheduler processes. void kvmalloc(void) { kpgdir = setupkvm(enter_alloc); switchkvm(); } // Switch h/w page table register to the kernel−only page table, // for when no process is running. void switchkvm(void) { lcr3(v2p(kpgdir)); // switch to the kernel page table } // Switch TSS and h/w page table to correspond to process p. void switchuvm(struct proc *p) { pushcli(); cpu−>gdt[SEG_TSS] = SEG16(STS_T32A, &cpu−>ts, sizeof(cpu−>ts)−1, 0); cpu−>gdt[SEG_TSS].s = 0; cpu−>ts.ss0 = SEG_KDATA << 3; cpu−>ts.esp0 = (uint)proc−>kstack + KSTACKSIZE; ltr(SEG_TSS << 3); if(p−>pgdir == 0) panic("switchuvm: no pgdir"); lcr3(v2p(p−>pgdir)); // switch to new address space popcli(); } // Load the initcode into address 0 of pgdir. // sz must be less than a page. void inituvm(pde_t *pgdir, char *init, uint sz) { char *mem; if(sz >= PGSIZE) panic("inituvm: more than a page"); mem = kalloc(); memset(mem, 0, PGSIZE); mappages(pgdir, 0, PGSIZE, v2p(mem), PTE_W|PTE_U, kalloc); memmove(mem, init, sz); } Sheet 17 Sep 5 23:39 2011 xv6/vm.c Page 5 Sep 5 23:39 2011 xv6/vm.c Page 6 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1821 1822 1823 1824 1825 1826 1827 1828 1829 1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864 1865 1866 1867 1868 1869 1870 1871 1872 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882 1883 1884 1885 1886 1887 1888 1889 1890 1891 1892 1893 1894 1895 1896 1897 1898 1899 // Load a program segment into pgdir. addr must be page−aligned // and the pages from addr to addr+sz must already be mapped. int loaduvm(pde_t *pgdir, char *addr, struct inode *ip, uint offset, uint sz) { uint i, pa, n; pte_t *pte; if((uint) addr % PGSIZE != 0) panic("loaduvm: addr must be page aligned"); for(i = 0; i < sz; i += PGSIZE){ if((pte = walkpgdir(pgdir, addr+i, 0)) == 0) panic("loaduvm: address should exist"); pa = PTE_ADDR(*pte); if(sz − i < PGSIZE) n = sz − i; else n = PGSIZE; if(readi(ip, p2v(pa), offset+i, n) != n) return −1; } return 0; } // Allocate page tables and physical memory to grow process from oldsz to // newsz, which need not be page aligned. Returns new size or 0 on error. int allocuvm(pde_t *pgdir, uint oldsz, uint newsz) { char *mem; uint a; if(newsz return if(newsz return >= KERNBASE) 0; < oldsz) oldsz; a = PGROUNDUP(oldsz); for(; a < newsz; a += PGSIZE){ mem = kalloc(); if(mem == 0){ cprintf("allocuvm out of memory\n"); deallocuvm(pgdir, newsz, oldsz); return 0; } memset(mem, 0, PGSIZE); mappages(pgdir, (char*)a, PGSIZE, v2p(mem), PTE_W|PTE_U, kalloc); } return newsz; } Sheet 18 // Deallocate user pages to bring the process size from oldsz to // newsz. oldsz and newsz need not be page−aligned, nor does newsz // need to be less than oldsz. oldsz can be larger than the actual // process size. Returns the new process size. int deallocuvm(pde_t *pgdir, uint oldsz, uint newsz) { pte_t *pte; uint a, pa; if(newsz >= oldsz) return oldsz; a = PGROUNDUP(newsz); for(; a < oldsz; a += PGSIZE){ pte = walkpgdir(pgdir, (char*)a, 0); if(!pte) a += (NPTENTRIES − 1) * PGSIZE; else if((*pte & PTE_P) != 0){ pa = PTE_ADDR(*pte); if(pa == 0) panic("kfree"); char *v = p2v(pa); kfree(v); *pte = 0; } } return newsz; } // Free a page table and all the physical memory pages // in the user part. void freevm(pde_t *pgdir) { uint i; if(pgdir == 0) panic("freevm: no pgdir"); deallocuvm(pgdir, KERNBASE, 0); for(i = 0; i < NPDENTRIES; i++){ if(pgdir[i] & PTE_P){ char * v = p2v(PTE_ADDR(pgdir[i])); kfree(v); } } kfree((char*)pgdir); } Sheet 18 Sep 5 23:39 2011 xv6/vm.c Page 7 Sep 5 23:39 2011 xv6/vm.c Page 8 1900 1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 // Clear PTE_U on a page. Used to create an inaccessible // page beneath the user stack. void clearpteu(pde_t *pgdir, char *uva) { pte_t *pte; pte = walkpgdir(pgdir, uva, 0); if(pte == 0) panic("clearpteu"); *pte &= ~PTE_U; } // Given a parent process’s page table, create a copy // of it for a child. pde_t* copyuvm(pde_t *pgdir, uint sz) { pde_t *d; pte_t *pte; uint pa, i; char *mem; if((d = setupkvm(kalloc)) == 0) return 0; for(i = 0; i < sz; i += PGSIZE){ if((pte = walkpgdir(pgdir, (void *) i, 0)) == 0) panic("copyuvm: pte should exist"); if(!(*pte & PTE_P)) panic("copyuvm: page not present"); pa = PTE_ADDR(*pte); if((mem = kalloc()) == 0) goto bad; memmove(mem, (char*)p2v(pa), PGSIZE); if(mappages(d, (void*)i, PGSIZE, v2p(mem), PTE_W|PTE_U, kalloc) < 0) goto bad; } return d; bad: freevm(d); return 0; } Sheet 19 // Map user virtual address to kernel address. char* uva2ka(pde_t *pgdir, char *uva) { pte_t *pte; pte = walkpgdir(pgdir, uva, 0); if((*pte & PTE_P) == 0) return 0; if((*pte & PTE_U) == 0) return 0; return (char*)p2v(PTE_ADDR(*pte)); } // Copy len bytes from p to user address va in page table pgdir. // Most useful when pgdir is not the current page table. // uva2ka ensures this only works for PTE_U pages. int copyout(pde_t *pgdir, uint va, void *p, uint len) { char *buf, *pa0; uint n, va0; buf = (char*)p; while(len > 0){ va0 = (uint)PGROUNDDOWN(va); pa0 = uva2ka(pgdir, (char*)va0); if(pa0 == 0) return −1; n = PGSIZE − (va − va0); if(n > len) n = len; memmove(pa0 + (va − va0), buf, n); len −= n; buf += n; va = va0 + PGSIZE; } return 0; } Sheet 19 Sep 5 23:39 2011 xv6/proc.h Page 1 Sep 5 23:39 2011 xv6/proc.h Page 2 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028 2029 2030 2031 2032 2033 2034 2035 2036 2037 2038 2039 2040 2041 2042 2043 2044 2045 2046 2047 2048 2049 2050 2051 2052 2053 2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 2088 2089 2090 2091 2092 2093 2094 2095 2096 2097 2098 2099 // Segments in proc−>gdt. #define NSEGS 7 // Per−CPU state struct cpu { uchar id; struct context *scheduler; struct taskstate ts; struct segdesc gdt[NSEGS]; volatile uint started; int ncli; int intena; // Local APIC ID; index into cpus[] below // swtch() here to enter scheduler // Used by x86 to find stack for interrupt // x86 global descriptor table // Has the CPU started? // Depth of pushcli nesting. // Were interrupts enabled before pushcli? // Cpu−local storage variables; see below struct cpu *cpu; struct proc *proc; // The currently−running process. }; extern struct cpu cpus[NCPU]; extern int ncpu; // Per−CPU variables, holding pointers to the // current cpu and to the current process. // The asm suffix tells gcc to use "%gs:0" to refer to cpu // and "%gs:4" to refer to proc. seginit sets up the // %gs segment register so that %gs refers to the memory // holding those two variables in the local cpu’s struct cpu. // This is similar to how thread−local variables are implemented // in thread libraries such as Linux pthreads. extern struct cpu *cpu asm("%gs:0"); // &cpus[cpunum()] extern struct proc *proc asm("%gs:4"); // cpus[cpunum()].proc // Saved registers for kernel context switches. // Don’t need to save all the segment registers (%cs, etc), // because they are constant across kernel contexts. // Don’t need to save %eax, %ecx, %edx, because the // x86 convention is that the caller has saved them. // Contexts are stored at the bottom of the stack they // describe; the stack pointer is the address of the context. // The layout of the context matches the layout of the stack in swtch.S // at the "Switch stacks" comment. Switch doesn’t save eip explicitly, // but it is on the stack and allocproc() manipulates it. struct context { uint edi; uint esi; uint ebx; uint ebp; uint eip; }; Sheet 20 enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE }; // Per−process state struct proc { uint sz; pde_t* pgdir; char *kstack; enum procstate state; volatile int pid; struct proc *parent; struct trapframe *tf; struct context *context; void *chan; int killed; struct file *ofile[NOFILE]; struct inode *cwd; char name[16]; }; // // // // // // // // // // // // // Size of process memory (bytes) Page table Bottom of kernel stack for this process Process state Process ID Parent process Trap frame for current syscall swtch() here to run process If non−zero, sleeping on chan If non−zero, have been killed Open files Current directory Process name (debugging) // Process memory is laid out contiguously, low addresses first: // text // original data and bss // fixed−size stack // expandable heap Sheet 20 Sep 5 23:39 2011 xv6/proc.c Page 1 Sep 5 23:39 2011 xv6/proc.c Page 2 2100 2101 2102 2103 2104 2105 2106 2107 2108 2109 2110 2111 2112 2113 2114 2115 2116 2117 2118 2119 2120 2121 2122 2123 2124 2125 2126 2127 2128 2129 2130 2131 2132 2133 2134 2135 2136 2137 2138 2139 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "memlayout.h" "mmu.h" "x86.h" "proc.h" "spinlock.h" struct { struct spinlock lock; struct proc proc[NPROC]; } ptable; static struct proc *initproc; int nextpid = 1; extern void forkret(void); extern void trapret(void); static void wakeup1(void *chan); void pinit(void) { initlock(&ptable.lock, "ptable"); } Sheet 21 // Look in the process table for an UNUSED proc. // If found, change state to EMBRYO and initialize // state required to run in the kernel. // Otherwise return 0. static struct proc* allocproc(void) { struct proc *p; char *sp; acquire(&ptable.lock); for(p = ptable.proc; p < &ptable.proc[NPROC]; p++) if(p−>state == UNUSED) goto found; release(&ptable.lock); return 0; found: p−>state = EMBRYO; p−>pid = nextpid++; release(&ptable.lock); // Allocate kernel stack. if((p−>kstack = kalloc()) == 0){ p−>state = UNUSED; return 0; } sp = p−>kstack + KSTACKSIZE; // Leave room for trap frame. sp −= sizeof *p−>tf; p−>tf = (struct trapframe*)sp; // Set up new context to start executing at forkret, // which returns to trapret. sp −= 4; *(uint*)sp = (uint)trapret; sp −= sizeof *p−>context; p−>context = (struct context*)sp; memset(p−>context, 0, sizeof *p−>context); p−>context−>eip = (uint)forkret; return p; } Sheet 21 Sep 5 23:39 2011 xv6/proc.c Page 3 Sep 5 23:39 2011 xv6/proc.c Page 4 2200 2201 2202 2203 2204 2205 2206 2207 2208 2209 2210 2211 2212 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 2230 2231 2232 2233 2234 2235 2236 2237 2238 2239 2240 2241 2242 2243 2244 2245 2246 2247 2248 2249 2250 2251 2252 2253 2254 2255 2256 2257 2258 2259 2260 2261 2262 2263 2264 2265 2266 2267 2268 2269 2270 2271 2272 2273 2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 2285 2286 2287 2288 2289 2290 2291 2292 2293 2294 2295 2296 2297 2298 2299 // Set up first user process. void userinit(void) { struct proc *p; extern char _binary_initcode_start[], _binary_initcode_size[]; p = allocproc(); initproc = p; if((p−>pgdir = setupkvm(kalloc)) == 0) panic("userinit: out of memory?"); inituvm(p−>pgdir, _binary_initcode_start, (int)_binary_initcode_size); p−>sz = PGSIZE; memset(p−>tf, 0, sizeof(*p−>tf)); p−>tf−>cs = (SEG_UCODE << 3) | DPL_USER; p−>tf−>ds = (SEG_UDATA << 3) | DPL_USER; p−>tf−>es = p−>tf−>ds; p−>tf−>ss = p−>tf−>ds; p−>tf−>eflags = FL_IF; p−>tf−>esp = PGSIZE; p−>tf−>eip = 0; // beginning of initcode.S safestrcpy(p−>name, "initcode", sizeof(p−>name)); p−>cwd = namei("/"); p−>state = RUNNABLE; } // Grow current process’s memory by n bytes. // Return 0 on success, −1 on failure. int growproc(int n) { uint sz; sz = proc−>sz; if(n > 0){ if((sz = allocuvm(proc−>pgdir, sz, sz + n)) == 0) return −1; } else if(n < 0){ if((sz = deallocuvm(proc−>pgdir, sz, sz + n)) == 0) return −1; } proc−>sz = sz; switchuvm(proc); return 0; } Sheet 22 // Create a new process copying p as the parent. // Sets up stack to return as if from system call. // Caller must set state of returned proc to RUNNABLE. int fork(void) { int i, pid; struct proc *np; // Allocate process. if((np = allocproc()) == 0) return −1; // Copy process state from p. if((np−>pgdir = copyuvm(proc−>pgdir, proc−>sz)) == 0){ kfree(np−>kstack); np−>kstack = 0; np−>state = UNUSED; return −1; } np−>sz = proc−>sz; np−>parent = proc; *np−>tf = *proc−>tf; // Clear %eax so that fork returns 0 in the child. np−>tf−>eax = 0; for(i = 0; i < NOFILE; i++) if(proc−>ofile[i]) np−>ofile[i] = filedup(proc−>ofile[i]); np−>cwd = idup(proc−>cwd); pid = np−>pid; np−>state = RUNNABLE; safestrcpy(np−>name, proc−>name, sizeof(proc−>name)); return pid; } Sheet 22 Sep 5 23:39 2011 xv6/proc.c Page 5 Sep 5 23:39 2011 xv6/proc.c Page 6 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 2320 2321 2322 2323 2324 2325 2326 2327 2328 2329 2330 2331 2332 2333 2334 2335 2336 2337 2338 2339 2340 2341 2342 2343 2344 2345 2346 2347 2348 2349 2350 2351 2352 2353 2354 2355 2356 2357 2358 2359 2360 2361 2362 2363 2364 2365 2366 2367 2368 2369 2370 2371 2372 2373 2374 2375 2376 2377 2378 2379 2380 2381 2382 2383 2384 2385 2386 2387 2388 2389 2390 2391 2392 2393 2394 2395 2396 2397 2398 2399 // Exit the current process. Does not return. // An exited process remains in the zombie state // until its parent calls wait() to find out it exited. void exit(void) { struct proc *p; int fd; if(proc == initproc) panic("init exiting"); // Close all open files. for(fd = 0; fd < NOFILE; fd++){ if(proc−>ofile[fd]){ fileclose(proc−>ofile[fd]); proc−>ofile[fd] = 0; } } iput(proc−>cwd); proc−>cwd = 0; acquire(&ptable.lock); // Parent might be sleeping in wait(). wakeup1(proc−>parent); // Pass abandoned children to init. for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ if(p−>parent == proc){ p−>parent = initproc; if(p−>state == ZOMBIE) wakeup1(initproc); } } // Jump into the scheduler, never to return. proc−>state = ZOMBIE; sched(); panic("zombie exit"); } Sheet 23 // Wait for a child process to exit and return its pid. // Return −1 if this process has no children. int wait(void) { struct proc *p; int havekids, pid; acquire(&ptable.lock); for(;;){ // Scan through table looking for zombie children. havekids = 0; for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ if(p−>parent != proc) continue; havekids = 1; if(p−>state == ZOMBIE){ // Found one. pid = p−>pid; kfree(p−>kstack); p−>kstack = 0; freevm(p−>pgdir); p−>state = UNUSED; p−>pid = 0; p−>parent = 0; p−>name[0] = 0; p−>killed = 0; release(&ptable.lock); return pid; } } // No point waiting if we don’t have any children. if(!havekids || proc−>killed){ release(&ptable.lock); return −1; } // Wait for children to exit. (See wakeup1 call in proc_exit.) sleep(proc, &ptable.lock); } } Sheet 23 Sep 5 23:39 2011 xv6/proc.c Page 7 Sep 5 23:39 2011 xv6/proc.c Page 8 2400 2401 2402 2403 2404 2405 2406 2407 2408 2409 2410 2411 2412 2413 2414 2415 2416 2417 2418 2419 2420 2421 2422 2423 2424 2425 2426 2427 2428 2429 2430 2431 2432 2433 2434 2435 2436 2437 2438 2439 2440 2441 2442 2443 2444 2445 2446 2447 2448 2449 2450 2451 2452 2453 2454 2455 2456 2457 2458 2459 2460 2461 2462 2463 2464 2465 2466 2467 2468 2469 2470 2471 2472 2473 2474 2475 2476 2477 2478 2479 2480 2481 2482 2483 2484 2485 2486 2487 2488 2489 2490 2491 2492 2493 2494 2495 2496 2497 2498 2499 // Per−CPU process scheduler. // Each CPU calls scheduler() after setting itself up. // Scheduler never returns. It loops, doing: // − choose a process to run // − swtch to start running that process // − eventually that process transfers control // via swtch back to the scheduler. void scheduler(void) { struct proc *p; for(;;){ // Enable interrupts on this processor. sti(); // Loop over process table looking for process to run. acquire(&ptable.lock); for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ if(p−>state != RUNNABLE) continue; // Switch to chosen process. It is the process’s job // to release ptable.lock and then reacquire it // before jumping back to us. proc = p; switchuvm(p); p−>state = RUNNING; swtch(&cpu−>scheduler, proc−>context); switchkvm(); // Process is done running for now. // It should have changed its p−>state before coming back. proc = 0; } release(&ptable.lock); } } Sheet 24 // Enter scheduler. Must hold only ptable.lock // and have changed proc−>state. void sched(void) { int intena; if(!holding(&ptable.lock)) panic("sched ptable.lock"); if(cpu−>ncli != 1) panic("sched locks"); if(proc−>state == RUNNING) panic("sched running"); if(readeflags()&FL_IF) panic("sched interruptible"); intena = cpu−>intena; swtch(&proc−>context, cpu−>scheduler); cpu−>intena = intena; } // Give up the CPU for one scheduling round. void yield(void) { acquire(&ptable.lock); proc−>state = RUNNABLE; sched(); release(&ptable.lock); } // A fork child’s very first scheduling by scheduler() // will swtch here. "Return" to user space. void forkret(void) { static int first = 1; // Still holding ptable.lock from scheduler. release(&ptable.lock); if (first) { // Some initialization functions must be run in the context // of a regular process (e.g., they call sleep), and thus cannot // be run from main(). first = 0; initlog(); } // Return to "caller", actually trapret (see allocproc). } Sheet 24 Sep 5 23:39 2011 xv6/proc.c Page 9 Sep 5 23:39 2011 xv6/proc.c Page 10 2500 2501 2502 2503 2504 2505 2506 2507 2508 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 2523 2524 2525 2526 2527 2528 2529 2530 2531 2532 2533 2534 2535 2536 2537 2538 2539 2540 2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 2558 2559 2560 2561 2562 2563 2564 2565 2566 2567 2568 2569 2570 2571 2572 2573 2574 2575 2576 2577 2578 2579 2580 2581 2582 2583 2584 2585 2586 2587 2588 2589 2590 2591 2592 2593 2594 2595 2596 2597 2598 2599 // Atomically release lock and sleep on chan. // Reacquires lock when awakened. void sleep(void *chan, struct spinlock *lk) { if(proc == 0) panic("sleep"); if(lk == 0) panic("sleep without lk"); // Must acquire ptable.lock in order to // change p−>state and then call sched. // Once we hold ptable.lock, we can be // guaranteed that we won’t miss any wakeup // (wakeup runs with ptable.lock locked), // so it’s okay to release lk. if(lk != &ptable.lock){ acquire(&ptable.lock); release(lk); } // Go to sleep. proc−>chan = chan; proc−>state = SLEEPING; sched(); // Tidy up. proc−>chan = 0; // Reacquire original lock. if(lk != &ptable.lock){ release(&ptable.lock); acquire(lk); } } Sheet 25 // Wake up all processes sleeping on chan. // The ptable lock must be held. static void wakeup1(void *chan) { struct proc *p; for(p = ptable.proc; p < &ptable.proc[NPROC]; p++) if(p−>state == SLEEPING && p−>chan == chan) p−>state = RUNNABLE; } // Wake up all processes sleeping on chan. void wakeup(void *chan) { acquire(&ptable.lock); wakeup1(chan); release(&ptable.lock); } // Kill the process with the given pid. // Process won’t exit until it returns // to user space (see trap in trap.c). int kill(int pid) { struct proc *p; acquire(&ptable.lock); for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ if(p−>pid == pid){ p−>killed = 1; // Wake process from sleep if necessary. if(p−>state == SLEEPING) p−>state = RUNNABLE; release(&ptable.lock); return 0; } } release(&ptable.lock); return −1; } Sheet 25 Sep 5 23:39 2011 xv6/proc.c Page 11 Sep 5 23:39 2011 xv6/swtch.S Page 1 2600 2601 2602 2603 2604 2605 2606 2607 2608 2609 2610 2611 2612 2613 2614 2615 2616 2617 2618 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629 2630 2631 2632 2633 2634 2635 2636 2637 2638 2639 2640 2641 2642 2643 2644 2645 2646 2647 2648 2649 2650 2651 2652 2653 2654 2655 2656 2657 2658 2659 2660 2661 2662 2663 2664 2665 2666 2667 2668 2669 2670 2671 2672 2673 2674 2675 2676 2677 2678 2679 2680 2681 2682 2683 2684 2685 2686 2687 2688 2689 2690 2691 2692 2693 2694 2695 2696 2697 2698 2699 // Print a process listing to console. For debugging. // Runs when user types ^P on console. // No lock to avoid wedging a stuck machine further. void procdump(void) { static char *states[] = { [UNUSED] "unused", [EMBRYO] "embryo", [SLEEPING] "sleep ", [RUNNABLE] "runble", [RUNNING] "run ", [ZOMBIE] "zombie" }; int i; struct proc *p; char *state; uint pc[10]; for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ if(p−>state == UNUSED) continue; if(p−>state >= 0 && p−>state < NELEM(states) && states[p−>state]) state = states[p−>state]; else state = "???"; cprintf("%d %s %s", p−>pid, state, p−>name); if(p−>state == SLEEPING){ getcallerpcs((uint*)p−>context−>ebp+2, pc); for(i=0; i<10 && pc[i] != 0; i++) cprintf(" %p", pc[i]); } cprintf("\n"); } } Sheet 26 # Context switch # # void swtch(struct context **old, struct context *new); # # Save current register context in old # and then load register context from new. .globl swtch swtch: movl 4(%esp), %eax movl 8(%esp), %edx # Save old callee−save registers pushl %ebp pushl %ebx pushl %esi pushl %edi # Switch stacks movl %esp, (%eax) movl %edx, %esp # Load new callee−save registers popl %edi popl %esi popl %ebx popl %ebp ret Sheet 26 Sep 5 23:39 2011 xv6/kalloc.c Page 1 Sep 5 23:39 2011 xv6/kalloc.c Page 2 2700 2701 2702 2703 2704 2705 2706 2707 2708 2709 2710 2711 2712 2713 2714 2715 2716 2717 2718 2719 2720 2721 2722 2723 2724 2725 2726 2727 2728 2729 2730 2731 2732 2733 2734 2735 2736 2737 2738 2739 2740 2741 2742 2743 2744 2745 2746 2747 2748 2749 2750 2751 2752 2753 2754 2755 2756 2757 2758 2759 2760 2761 2762 2763 2764 2765 2766 2767 2768 2769 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 2780 2781 2782 2783 2784 2785 2786 2787 2788 2789 2790 2791 2792 2793 2794 2795 2796 2797 2798 2799 // Physical memory allocator, intended to allocate // memory for user processes, kernel stacks, page table pages, // and pipe buffers. Allocates 4096−byte pages. #include #include #include #include #include #include "types.h" "defs.h" "param.h" "memlayout.h" "mmu.h" "spinlock.h" struct run { struct run *next; }; struct { struct spinlock lock; struct run *freelist; } kmem; extern char end[]; // first address after kernel loaded from ELF file static char *newend; // A simple page allocator to get off the ground during entry char * enter_alloc(void) { if (newend == 0) newend = end; if ((uint) newend >= KERNBASE + 0x400000) panic("only first 4Mbyte are mapped during entry"); void *p = (void*)PGROUNDUP((uint)newend); memset(p, 0, PGSIZE); newend = newend + PGSIZE; return p; } // Initialize free list of physical pages. void kinit(void) { char *p; initlock(&kmem.lock, "kmem"); p = (char*)PGROUNDUP((uint)newend); for(; p + PGSIZE <= (char*)p2v(PHYSTOP); p += PGSIZE) kfree(p); } Sheet 27 // Free the page of physical memory pointed at by v, // which normally should have been returned by a // call to kalloc(). (The exception is when // initializing the allocator; see kinit above.) void kfree(char *v) { struct run *r; if((uint)v % PGSIZE || v < end || v2p(v) >= PHYSTOP) panic("kfree"); // Fill with junk to catch dangling refs. memset(v, 1, PGSIZE); acquire(&kmem.lock); r = (struct run*)v; r−>next = kmem.freelist; kmem.freelist = r; release(&kmem.lock); } // Allocate one 4096−byte page of physical memory. // Returns a pointer that the kernel can use. // Returns 0 if the memory cannot be allocated. char* kalloc(void) { struct run *r; acquire(&kmem.lock); r = kmem.freelist; if(r) kmem.freelist = r−>next; release(&kmem.lock); return (char*)r; } Sheet 27 Sep 5 23:39 2011 xv6/traps.h Page 1 Sep 5 23:39 2011 xv6/vectors.pl Page 1 2800 2801 2802 2803 2804 2805 2806 2807 2808 2809 2810 2811 2812 2813 2814 2815 2816 2817 2818 2819 2820 2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835 2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850 2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 2880 2881 2882 2883 2884 2885 2886 2887 2888 2889 2890 2891 2892 2893 2894 2895 2896 2897 2898 2899 // x86 trap and interrupt constants. // Processor−defined: #define T_DIVIDE #define T_DEBUG #define T_NMI #define T_BRKPT #define T_OFLOW #define T_BOUND #define T_ILLOP #define T_DEVICE #define T_DBLFLT // #define T_COPROC #define T_TSS #define T_SEGNP #define T_STACK #define T_GPFLT #define T_PGFLT // #define T_RES #define T_FPERR #define T_ALIGN #define T_MCHK #define T_SIMDERR 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 // // // // // // // // // // // // // // // // // // // // divide error debug exception non−maskable interrupt breakpoint overflow bounds check illegal opcode device not available double fault reserved (not used since 486) invalid task switch segment segment not present stack exception general protection fault page fault reserved floating point error aligment check machine check SIMD floating point error // These are arbitrarily chosen, but with care not to overlap // processor defined exceptions or interrupt vectors. #define T_SYSCALL 64 // system call #define T_DEFAULT 500 // catchall #define T_IRQ0 32 #define #define #define #define #define #define 0 1 4 14 19 31 Sheet 28 IRQ_TIMER IRQ_KBD IRQ_COM1 IRQ_IDE IRQ_ERROR IRQ_SPURIOUS // IRQ 0 corresponds to int T_IRQ #!/usr/bin/perl −w # # # # Generate vectors.S, the trap/interrupt entry points. There has to be one entry point per interrupt number since otherwise there’s no way for trap() to discover the interrupt number. print "# generated by vectors.pl − do not edit\n"; print "# handlers\n"; print ".globl alltraps\n"; for(my $i = 0; $i < 256; $i++){ print ".globl vector$i\n"; print "vector$i:\n"; if(!($i == 8 || ($i >= 10 && $i <= 14) || $i == 17)){ print " pushl \$0\n"; } print " pushl \$$i\n"; print " jmp alltraps\n"; } print "\n# vector table\n"; print ".data\n"; print ".globl vectors\n"; print "vectors:\n"; for(my $i = 0; $i < 256; $i++){ print " .long vector$i\n"; } # sample output: # # handlers # .globl alltraps # .globl vector0 # vector0: # pushl $0 # pushl $0 # jmp alltraps # ... # # # vector table # .data # .globl vectors # vectors: # .long vector0 # .long vector1 # .long vector2 # ... Sheet 28 Sep 5 23:39 2011 xv6/trapasm.S Page 1 Sep 5 23:39 2011 xv6/trap.c Page 1 2900 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2920 2921 2922 2923 2924 2925 2926 2927 2928 2929 2930 2931 2932 2933 2934 2935 2936 2937 2938 2939 2940 2941 2942 2943 2944 2945 2946 2947 2948 2949 2950 2951 2952 2953 2954 2955 2956 2957 2958 2959 2960 2961 2962 2963 2964 2965 2966 2967 2968 2969 2970 2971 2972 2973 2974 2975 2976 2977 2978 2979 2980 2981 2982 2983 2984 2985 2986 2987 2988 2989 2990 2991 2992 2993 2994 2995 2996 2997 2998 2999 #include "mmu.h" # vectors.S sends all traps here. .globl alltraps alltraps: # Build trap frame. pushl %ds pushl %es pushl %fs pushl %gs pushal # Set up data and per−cpu segments. movw $(SEG_KDATA<<3), %ax movw %ax, %ds movw %ax, %es movw $(SEG_KCPU<<3), %ax movw %ax, %fs movw %ax, %gs # Call trap(tf), where tf=%esp pushl %esp call trap addl $4, %esp # Return falls through to trapret... .globl trapret trapret: popal popl %gs popl %fs popl %es popl %ds addl $0x8, %esp # trapno and errcode iret Sheet 29 #include #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "memlayout.h" "mmu.h" "proc.h" "x86.h" "traps.h" "spinlock.h" // Interrupt descriptor table (shared by all CPUs). struct gatedesc idt[256]; extern uint vectors[]; // in vectors.S: array of 256 entry pointers struct spinlock tickslock; uint ticks; void tvinit(void) { int i; for(i = 0; i < 256; i++) SETGATE(idt[i], 0, SEG_KCODE<<3, vectors[i], 0); SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER); initlock(&tickslock, "time"); } void idtinit(void) { lidt(idt, sizeof(idt)); } Sheet 29 Sep 5 23:39 2011 xv6/trap.c Page 2 Sep 5 23:39 2011 xv6/trap.c Page 3 3000 void 3001 trap(struct trapframe *tf) 3002 { 3003 if(tf−>trapno == T_SYSCALL){ 3004 if(proc−>killed) 3005 exit(); 3006 proc−>tf = tf; 3007 syscall(); 3008 if(proc−>killed) 3009 exit(); 3010 return; 3011 } 3012 3013 switch(tf−>trapno){ 3014 case T_IRQ0 + IRQ_TIMER: 3015 if(cpu−>id == 0){ 3016 acquire(&tickslock); 3017 ticks++; 3018 wakeup(&ticks); 3019 release(&tickslock); 3020 } 3021 lapiceoi(); 3022 break; 3023 case T_IRQ0 + IRQ_IDE: 3024 ideintr(); 3025 lapiceoi(); 3026 break; 3027 case T_IRQ0 + IRQ_IDE+1: 3028 // Bochs generates spurious IDE1 interrupts. 3029 break; 3030 case T_IRQ0 + IRQ_KBD: 3031 kbdintr(); 3032 lapiceoi(); 3033 break; 3034 case T_IRQ0 + IRQ_COM1: 3035 uartintr(); 3036 lapiceoi(); 3037 break; 3038 case T_IRQ0 + 7: 3039 case T_IRQ0 + IRQ_SPURIOUS: 3040 cprintf("cpu%d: spurious interrupt at %x:%x\n", 3041 cpu−>id, tf−>cs, tf−>eip); 3042 lapiceoi(); 3043 break; 3044 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 3060 3061 3062 3063 3064 3065 3066 3067 3068 3069 3070 3071 3072 3073 3074 3075 3076 3077 3078 3079 } 3080 3081 3082 3083 3084 3085 3086 3087 3088 3089 3090 3091 3092 3093 3094 3095 3096 3097 3098 3099 Sheet 30 Sheet 30 default: if(proc == 0 || (tf−>cs&3) == 0){ // In kernel, it must be our mistake. cprintf("unexpected trap %d from cpu %d eip %x (cr2=0x%x)\n", tf−>trapno, cpu−>id, tf−>eip, rcr2()); panic("trap"); } // In user space, assume process misbehaved. cprintf("pid %d %s: trap %d err %d on cpu %d " "eip 0x%x addr 0x%x−−kill proc\n", proc−>pid, proc−>name, tf−>trapno, tf−>err, cpu−>id, tf−>eip, rcr2()); proc−>killed = 1; } // Force process exit if it has been killed and is in user space. // (If it is still executing in the kernel, let it keep running // until it gets to the regular system call return.) if(proc && proc−>killed && (tf−>cs&3) == DPL_USER) exit(); // Force process to give up CPU on clock tick. // If interrupts were on while locks held, would need to check nlock. if(proc && proc−>state == RUNNING && tf−>trapno == T_IRQ0+IRQ_TIMER) yield(); // Check if the process has been killed since we yielded if(proc && proc−>killed && (tf−>cs&3) == DPL_USER) exit(); Sep 5 23:39 2011 xv6/syscall.h Page 1 Sep 5 23:39 2011 xv6/syscall.c Page 1 3100 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125 3126 3127 3128 3129 3130 3131 3132 3133 3134 3135 3136 3137 3138 3139 3140 3141 3142 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3156 3157 3158 3159 3160 3161 3162 3163 3164 3165 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 // System call numbers #define SYS_fork 1 #define SYS_exit 2 #define SYS_wait 3 #define SYS_pipe 4 #define SYS_read 5 #define SYS_kill 6 #define SYS_exec 7 #define SYS_fstat 8 #define SYS_chdir 9 #define SYS_dup 10 #define SYS_getpid 11 #define SYS_sbrk 12 #define SYS_sleep 13 #define SYS_uptime 14 #define #define #define #define #define #define #define Sheet 31 SYS_open SYS_write SYS_mknod SYS_unlink SYS_link SYS_mkdir SYS_close 15 16 17 18 19 20 21 #include #include #include #include #include #include #include #include // // // // // "types.h" "defs.h" "param.h" "memlayout.h" "mmu.h" "proc.h" "x86.h" "syscall.h" User code makes a system call with INT T_SYSCALL. System call number in %eax. Arguments on the stack, from the user call to the C library system call function. The saved user %esp points to a saved program counter, and then the first argument. // Fetch the int at addr from process p. int fetchint(struct proc *p, uint addr, int *ip) { if(addr >= p−>sz || addr+4 > p−>sz) return −1; *ip = *(int*)(addr); return 0; } // Fetch the nul−terminated string at addr from process p. // Doesn’t actually copy the string − just sets *pp to point at it. // Returns length of string, not including nul. int fetchstr(struct proc *p, uint addr, char **pp) { char *s, *ep; if(addr >= p−>sz) return −1; *pp = (char*)addr; ep = (char*)p−>sz; for(s = *pp; s < ep; s++) if(*s == 0) return s − *pp; return −1; } // Fetch the nth 32−bit system call argument. int argint(int n, int *ip) { return fetchint(proc, proc−>tf−>esp + 4 + 4*n, ip); } Sheet 31 Sep 5 23:39 2011 xv6/syscall.c Page 2 Sep 5 23:39 2011 xv6/syscall.c Page 3 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266 3267 3268 3269 3270 3271 3272 3273 3274 3275 3276 3277 3278 3279 3280 3281 3282 3283 3284 3285 3286 3287 3288 3289 3290 3291 3292 3293 3294 3295 3296 3297 3298 3299 // Fetch the nth word−sized system call argument as a pointer // to a block of memory of size n bytes. Check that the pointer // lies within the process address space. int argptr(int n, char **pp, int size) { int i; if(argint(n, &i) < 0) return −1; if((uint)i >= proc−>sz || (uint)i+size > proc−>sz) return −1; *pp = (char*)i; return 0; } // Fetch the nth word−sized system call argument as a string pointer. // Check that the pointer is valid and the string is nul−terminated. // (There is no shared writable memory, so the string can’t change // between this check and being used by the kernel.) int argstr(int n, char **pp) { int addr; if(argint(n, &addr) < 0) return −1; return fetchstr(proc, addr, pp); } extern extern extern extern extern extern extern extern extern extern extern extern extern extern extern extern extern extern extern extern extern Sheet 32 int int int int int int int int int int int int int int int int int int int int int sys_chdir(void); sys_close(void); sys_dup(void); sys_exec(void); sys_exit(void); sys_fork(void); sys_fstat(void); sys_getpid(void); sys_kill(void); sys_link(void); sys_mkdir(void); sys_mknod(void); sys_open(void); sys_pipe(void); sys_read(void); sys_sbrk(void); sys_sleep(void); sys_unlink(void); sys_wait(void); sys_write(void); sys_uptime(void); static int (*syscalls[])(void) = { [SYS_fork] sys_fork, [SYS_exit] sys_exit, [SYS_wait] sys_wait, [SYS_pipe] sys_pipe, [SYS_read] sys_read, [SYS_kill] sys_kill, [SYS_exec] sys_exec, [SYS_fstat] sys_fstat, [SYS_chdir] sys_chdir, [SYS_dup] sys_dup, [SYS_getpid] sys_getpid, [SYS_sbrk] sys_sbrk, [SYS_sleep] sys_sleep, [SYS_uptime] sys_uptime, [SYS_open] sys_open, [SYS_write] sys_write, [SYS_mknod] sys_mknod, [SYS_unlink] sys_unlink, [SYS_link] sys_link, [SYS_mkdir] sys_mkdir, [SYS_close] sys_close, }; void syscall(void) { int num; num = proc−>tf−>eax; if(num >= 0 && num < SYS_open && syscalls[num]) { proc−>tf−>eax = syscalls[num](); } else if (num >= SYS_open && num < NELEM(syscalls) && syscalls[num]) { proc−>tf−>eax = syscalls[num](); } else { cprintf("%d %s: unknown sys call %d\n", proc−>pid, proc−>name, num); proc−>tf−>eax = −1; } } Sheet 32 Sep 5 23:39 2011 xv6/sysproc.c Page 1 Sep 5 23:39 2011 xv6/sysproc.c Page 2 3300 3301 3302 3303 3304 3305 3306 3307 3308 3309 3310 3311 3312 3313 3314 3315 3316 3317 3318 3319 3320 3321 3322 3323 3324 3325 3326 3327 3328 3329 3330 3331 3332 3333 3334 3335 3336 3337 3338 3339 3340 3341 3342 3343 3344 3345 3346 3347 3348 3349 3350 3351 3352 3353 3354 3355 3356 3357 3358 3359 3360 3361 3362 3363 3364 3365 3366 3367 3368 3369 3370 3371 3372 3373 3374 3375 3376 3377 3378 3379 3380 3381 3382 3383 3384 3385 3386 3387 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3398 3399 #include #include #include #include #include #include #include "types.h" "x86.h" "defs.h" "param.h" "memlayout.h" "mmu.h" "proc.h" int sys_fork(void) { return fork(); } int sys_exit(void) { exit(); return 0; // not reached } int sys_wait(void) { return wait(); } int sys_kill(void) { int pid; if(argint(0, &pid) < 0) return −1; return kill(pid); } int sys_getpid(void) { return proc−>pid; } Sheet 33 int sys_sbrk(void) { int addr; int n; if(argint(0, &n) < 0) return −1; addr = proc−>sz; if(growproc(n) < 0) return −1; return addr; } int sys_sleep(void) { int n; uint ticks0; if(argint(0, &n) < 0) return −1; acquire(&tickslock); ticks0 = ticks; while(ticks − ticks0 < n){ if(proc−>killed){ release(&tickslock); return −1; } sleep(&ticks, &tickslock); } release(&tickslock); return 0; } // return how many clock tick interrupts have occurred // since start. int sys_uptime(void) { uint xticks; acquire(&tickslock); xticks = ticks; release(&tickslock); return xticks; } Sheet 33 Sep 5 23:39 2011 xv6/buf.h Page 1 Sep 5 23:39 2011 xv6/fcntl.h Page 1 3400 3401 3402 3403 3404 3405 3406 3407 3408 3409 3410 3411 3412 3413 3414 3415 3416 3417 3418 3419 3420 3421 3422 3423 3424 3425 3426 3427 3428 3429 3430 3431 3432 3433 3434 3435 3436 3437 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 3490 3491 3492 3493 3494 3495 3496 3497 3498 3499 struct buf { int flags; uint dev; uint sector; struct buf *prev; // LRU cache list struct buf *next; struct buf *qnext; // disk queue uchar data[512]; }; #define B_BUSY 0x1 // buffer is locked by some process #define B_VALID 0x2 // buffer has been read from disk #define B_DIRTY 0x4 // buffer needs to be written to disk Sheet 34 #define #define #define #define Sheet 34 O_RDONLY O_WRONLY O_RDWR O_CREATE 0x000 0x001 0x002 0x200 Sep 5 23:39 2011 xv6/stat.h Page 1 Sep 5 23:39 2011 xv6/fs.h Page 1 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 3510 3511 3512 3513 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 3529 3530 3531 3532 3533 3534 3535 3536 3537 3538 3539 3540 3541 3542 3543 3544 3545 3546 3547 3548 3549 3550 3551 3552 3553 3554 3555 3556 3557 3558 3559 3560 3561 3562 3563 3564 3565 3566 3567 3568 3569 3570 3571 3572 3573 3574 3575 3576 3577 3578 3579 3580 3581 3582 3583 3584 3585 3586 3587 3588 3589 3590 3591 3592 3593 3594 3595 3596 3597 3598 3599 #define T_DIR 1 #define T_FILE 2 #define T_DEV 3 struct stat { short type; int dev; uint ino; short nlink; uint size; }; Sheet 35 // // // // // // Directory // File // Special device Type of file Device number Inode number on device Number of links to file Size of file in bytes // On−disk file system format. // Both the kernel and user programs use this header file. // Block 0 is unused. Block 1 is super block. // Inodes start at block 2. #define ROOTINO 1 // root i−number #define BSIZE 512 // block size // File system super struct superblock { uint size; uint nblocks; uint ninodes; uint nlog; }; block // // // // Size of file system image (blocks) Number of data blocks Number of inodes. Number of log blocks #define NDIRECT 12 #define NINDIRECT (BSIZE / sizeof(uint)) #define MAXFILE (NDIRECT + NINDIRECT) // On−disk inode structure struct dinode { short type; // short major; // short minor; // short nlink; // uint size; // uint addrs[NDIRECT+1]; }; File type Major device number (T_DEV only) Minor device number (T_DEV only) Number of links to inode in file system Size of file (bytes) // Data block addresses // Inodes per block. #define IPB (BSIZE / sizeof(struct dinode)) // Block containing inode i #define IBLOCK(i) ((i) / IPB + 2) // Bitmap bits per block #define BPB (BSIZE*8) // Block containing bit for block b #define BBLOCK(b, ninodes) (b/BPB + (ninodes)/IPB + 3) // Directory is a file containing a sequence of dirent structures. #define DIRSIZ 14 struct dirent { ushort inum; char name[DIRSIZ]; }; Sheet 35 Sep 5 23:39 2011 xv6/file.h Page 1 Sep 5 23:39 2011 xv6/ide.c Page 1 3600 3601 3602 3603 3604 3605 3606 3607 3608 3609 3610 3611 3612 3613 3614 3615 3616 3617 3618 3619 3620 3621 3622 3623 3624 3625 3626 3627 3628 3629 3630 3631 3632 3633 3634 3635 3636 3637 3638 3639 3640 3641 3642 3643 3644 3645 3646 3647 3648 3649 3650 3651 3652 3653 3654 3655 3656 3657 3658 3659 3660 3661 3662 3663 3664 3665 3666 3667 3668 3669 3670 3671 3672 3673 3674 3675 3676 3677 3678 3679 3680 3681 3682 3683 3684 3685 3686 3687 3688 3689 3690 3691 3692 3693 3694 3695 3696 3697 3698 3699 struct file { enum { FD_NONE, FD_PIPE, FD_INODE } type; int ref; // reference count char readable; char writable; struct pipe *pipe; struct inode *ip; uint off; }; // in−core file system types struct inode { uint dev; uint inum; int ref; int flags; // // // // Device number Inode number Reference count I_BUSY, I_VALID short type; // copy of disk inode short major; short minor; short nlink; uint size; uint addrs[NDIRECT+1]; }; #define I_BUSY 0x1 #define I_VALID 0x2 // device implementations struct devsw { int (*read)(struct inode*, char*, int); int (*write)(struct inode*, char*, int); }; extern struct devsw devsw[]; #define CONSOLE 1 Sheet 36 // Simple PIO−based (non−DMA) IDE driver code. #include #include #include #include #include #include #include #include #include #include #define #define #define #define "types.h" "defs.h" "param.h" "memlayout.h" "mmu.h" "proc.h" "x86.h" "traps.h" "spinlock.h" "buf.h" IDE_BSY IDE_DRDY IDE_DF IDE_ERR 0x80 0x40 0x20 0x01 #define IDE_CMD_READ 0x20 #define IDE_CMD_WRITE 0x30 // idequeue points to the buf now being read/written to the disk. // idequeue−>qnext points to the next buf to be processed. // You must hold idelock while manipulating queue. static struct spinlock idelock; static struct buf *idequeue; static int havedisk1; static void idestart(struct buf*); // Wait for IDE disk to become ready. static int idewait(int checkerr) { int r; while(((r = inb(0x1f7)) & (IDE_BSY|IDE_DRDY)) != IDE_DRDY) ; if(checkerr && (r & (IDE_DF|IDE_ERR)) != 0) return −1; return 0; } Sheet 36 Sep 5 23:39 2011 xv6/ide.c Page 2 Sep 5 23:39 2011 xv6/ide.c Page 3 3700 3701 3702 3703 3704 3705 3706 3707 3708 3709 3710 3711 3712 3713 3714 3715 3716 3717 3718 3719 3720 3721 3722 3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 3765 3766 3767 3768 3769 3770 3771 3772 3773 3774 3775 3776 3777 3778 3779 3780 3781 3782 3783 3784 3785 3786 3787 3788 3789 3790 3791 3792 3793 3794 3795 3796 3797 3798 3799 void ideinit(void) { int i; initlock(&idelock, "ide"); picenable(IRQ_IDE); ioapicenable(IRQ_IDE, ncpu − 1); idewait(0); // Check if disk 1 is present outb(0x1f6, 0xe0 | (1<<4)); for(i=0; i<1000; i++){ if(inb(0x1f7) != 0){ havedisk1 = 1; break; } } // Switch back to disk 0. outb(0x1f6, 0xe0 | (0<<4)); } // Start the request for b. Caller must hold idelock. static void idestart(struct buf *b) { if(b == 0) panic("idestart"); idewait(0); outb(0x3f6, 0); // generate interrupt outb(0x1f2, 1); // number of sectors outb(0x1f3, b−>sector & 0xff); outb(0x1f4, (b−>sector >> 8) & 0xff); outb(0x1f5, (b−>sector >> 16) & 0xff); outb(0x1f6, 0xe0 | ((b−>dev&1)<<4) | ((b−>sector>>24)&0x0f)); if(b−>flags & B_DIRTY){ outb(0x1f7, IDE_CMD_WRITE); outsl(0x1f0, b−>data, 512/4); } else { outb(0x1f7, IDE_CMD_READ); } } Sheet 37 // Interrupt handler. void ideintr(void) { struct buf *b; // Take first buffer off queue. acquire(&idelock); if((b = idequeue) == 0){ release(&idelock); // cprintf("spurious IDE interrupt\n"); return; } idequeue = b−>qnext; // Read data if needed. if(!(b−>flags & B_DIRTY) && idewait(1) >= 0) insl(0x1f0, b−>data, 512/4); // Wake process waiting for this buf. b−>flags |= B_VALID; b−>flags &= ~B_DIRTY; wakeup(b); // Start disk on next buf in queue. if(idequeue != 0) idestart(idequeue); release(&idelock); } Sheet 37 Sep 5 23:39 2011 xv6/ide.c Page 4 Sep 5 23:39 2011 xv6/bio.c Page 1 3800 3801 3802 3803 3804 3805 3806 3807 3808 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 3838 3839 3840 3841 3842 3843 3844 3845 3846 3847 3848 3849 3850 3851 3852 3853 3854 3855 3856 3857 3858 3859 3860 3861 3862 3863 3864 3865 3866 3867 3868 3869 3870 3871 3872 3873 3874 3875 3876 3877 3878 3879 3880 3881 3882 3883 3884 3885 3886 3887 3888 3889 3890 3891 3892 3893 3894 3895 3896 3897 3898 3899 // Sync buf with disk. // If B_DIRTY is set, write buf to disk, clear B_DIRTY, set B_VALID. // Else if B_VALID is not set, read buf from disk, set B_VALID. void iderw(struct buf *b) { struct buf **pp; if(!(b−>flags & B_BUSY)) panic("iderw: buf not busy"); if((b−>flags & (B_VALID|B_DIRTY)) == B_VALID) panic("iderw: nothing to do"); if(b−>dev != 0 && !havedisk1) panic("iderw: ide disk 1 not present"); acquire(&idelock); // DOC:acquire−lock // Append b to idequeue. b−>qnext = 0; for(pp=&idequeue; *pp; pp=&(*pp)−>qnext) // DOC:insert−queue ; *pp = b; // Start disk if necessary. if(idequeue == b) idestart(b); // Wait for request to finish. // Assuming will not sleep too long: ignore proc−>killed. while((b−>flags & (B_VALID|B_DIRTY)) != B_VALID){ sleep(b, &idelock); } release(&idelock); } Sheet 38 // // // // // // // // // // // // // // // // // // // // // // Buffer cache. The buffer cache is a linked list of buf structures holding cached copies of disk block contents. Caching disk blocks in memory reduces the number of disk reads and also provides a synchronization point for disk blocks used by multiple processes. Interface: * To get a buffer for a particular disk block, call bread. * After changing buffer data, call bwrite to flush it to disk. * When done with the buffer, call brelse. * Do not use the buffer after calling brelse. * Only one process at a time can use a buffer, so do not keep them longer than necessary. The implementation uses three state flags internally: * B_BUSY: the block has been returned from bread and has not been passed back to brelse. * B_VALID: the buffer data has been initialized with the associated disk block contents. * B_DIRTY: the buffer data has been modified and needs to be written to disk. #include #include #include #include #include "types.h" "defs.h" "param.h" "spinlock.h" "buf.h" struct { struct spinlock lock; struct buf buf[NBUF]; // Linked list of all buffers, through prev/next. // head.next is most recently used. struct buf head; } bcache; void binit(void) { struct buf *b; initlock(&bcache.lock, "bcache"); Sheet 38 Sep 5 23:39 2011 xv6/bio.c Page 2 Sep 5 23:39 2011 xv6/bio.c Page 3 3900 3901 3902 3903 3904 3905 3906 3907 3908 3909 3910 3911 3912 3913 3914 3915 3916 3917 3918 3919 3920 3921 3922 3923 3924 3925 3926 3927 3928 3929 3930 3931 3932 3933 3934 3935 3936 3937 3938 3939 3940 3941 3942 3943 3944 3945 3946 3947 3948 3949 3950 3951 3952 3953 3954 3955 3956 3957 3958 3959 3960 3961 3962 3963 3964 3965 3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 // Create linked list of buffers bcache.head.prev = &bcache.head; bcache.head.next = &bcache.head; for(b = bcache.buf; b < bcache.buf+NBUF; b++){ b−>next = bcache.head.next; b−>prev = &bcache.head; b−>dev = −1; bcache.head.next−>prev = b; bcache.head.next = b; } } // Look through buffer cache for sector on device dev. // If not found, allocate fresh block. // In either case, return locked buffer. static struct buf* bget(uint dev, uint sector) { struct buf *b; acquire(&bcache.lock); loop: // Try for cached block. for(b = bcache.head.next; b != &bcache.head; b = b−>next){ if(b−>dev == dev && b−>sector == sector){ if(!(b−>flags & B_BUSY)){ b−>flags |= B_BUSY; release(&bcache.lock); return b; } sleep(b, &bcache.lock); goto loop; } } // Allocate fresh block. for(b = bcache.head.prev; b != &bcache.head; b = b−>prev){ if((b−>flags & B_BUSY) == 0){ b−>dev = dev; b−>sector = sector; b−>flags = B_BUSY; release(&bcache.lock); return b; } } panic("bget: no buffers"); } Sheet 39 // Return a B_BUSY buf with the contents of the indicated disk sector. struct buf* bread(uint dev, uint sector) { struct buf *b; b = bget(dev, sector); if(!(b−>flags & B_VALID)) iderw(b); return b; } // Write b’s contents to disk. Must be locked. void bwrite(struct buf *b) { if((b−>flags & B_BUSY) == 0) panic("bwrite"); b−>flags |= B_DIRTY; iderw(b); } // Release the buffer b. void brelse(struct buf *b) { if((b−>flags & B_BUSY) == 0) panic("brelse"); acquire(&bcache.lock); b−>next−>prev = b−>prev; b−>prev−>next = b−>next; b−>next = bcache.head.next; b−>prev = &bcache.head; bcache.head.next−>prev = b; bcache.head.next = b; b−>flags &= ~B_BUSY; wakeup(b); release(&bcache.lock); } Sheet 39 Sep 5 23:39 2011 xv6/log.c Page 1 Sep 5 23:39 2011 xv6/log.c Page 2 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 #include #include #include #include #include #include // // // // // // // // // // // // // // // // // // // // // // // // // "types.h" "defs.h" "param.h" "spinlock.h" "fs.h" "buf.h" Simple logging. Each system call that might write the file system should be surrounded with begin_trans() and commit_trans() calls. The log holds at most one transaction at a time. Commit forces the log (with commit record) to disk, then installs the affected blocks to disk, then erases the log. begin_trans() ensures that only one system call can be in a transaction; others must wait. Allowing only one transaction at a time means that the file system code doesn’t have to worry about the possibility of one transaction reading a block that another one has modified, for example an i−node block. Read−only system calls don’t need to use transactions, though this means that they may observe uncommitted data. I−node and buffer locks prevent read−only calls from seeing inconsistent data. The log is a physical re−do log containing disk blocks. The on−disk log format: header block, containing sector #s for block A, B, C, ... block A block B block C ... Log appends are synchronous. // Contents of the header block, used for both the on−disk header block // and to keep track in memory of logged sector #s before commit. struct logheader { int n; int sector[LOGSIZE]; }; struct log { struct spinlock lock; int start; int size; int intrans; int dev; struct logheader lh; }; Sheet 40 struct log log; static void recover_from_log(void); void initlog(void) { if (sizeof(struct logheader) >= BSIZE) panic("initlog: too big logheader"); struct superblock sb; initlock(&log.lock, "log"); readsb(ROOTDEV, &sb); log.start = sb.size − sb.nlog; log.size = sb.nlog; log.dev = ROOTDEV; recover_from_log(); } // Copy committed blocks from log to their home location static void install_trans(void) { int tail; for (tail = 0; tail < log.lh.n; tail++) { struct buf *lbuf = bread(log.dev, log.start+tail+1); // read log block struct buf *dbuf = bread(log.dev, log.lh.sector[tail]); // read dst memmove(dbuf−>data, lbuf−>data, BSIZE); // copy block to dst bwrite(dbuf); // flush dst to disk brelse(lbuf); brelse(dbuf); } } // Read the log header from disk into the in−memory log header static void read_head(void) { struct buf *buf = bread(log.dev, log.start); struct logheader *lh = (struct logheader *) (buf−>data); int i; log.lh.n = lh−>n; for (i = 0; i < log.lh.n; i++) { log.lh.sector[i] = lh−>sector[i]; } brelse(buf); } Sheet 40 Sep 5 23:39 2011 xv6/log.c Page 3 Sep 5 23:39 2011 xv6/log.c Page 4 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110 4111 4112 4113 4114 4115 4116 4117 4118 4119 4120 4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 // Write in−memory log header to disk, committing log entries till head static void write_head(void) { struct buf *buf = bread(log.dev, log.start); struct logheader *hb = (struct logheader *) (buf−>data); int i; hb−>n = log.lh.n; for (i = 0; i < log.lh.n; i++) { hb−>sector[i] = log.lh.sector[i]; } bwrite(buf); brelse(buf); } static void recover_from_log(void) { read_head(); install_trans(); // if committed, copy from log to disk log.lh.n = 0; write_head(); // clear the log } void begin_trans(void) { acquire(&log.lock); while (log.intrans) { sleep(&log, &log.lock); } log.intrans = 1; release(&log.lock); } void commit_trans(void) { if (log.lh.n > 0) { write_head(); // Causes all blocks till log.head to be commited install_trans(); // Install all the transactions till head log.lh.n = 0; write_head(); // Reclaim log } acquire(&log.lock); log.intrans = 0; wakeup(&log); release(&log.lock); } Sheet 41 // Caller has modified b−>data and is done with the buffer. // Append the block to the log and record the block number, // but don’t write the log header (which would commit the write). // log_write() replaces bwrite(); a typical use is: // bp = bread(...) // modify bp−>data[] // log_write(bp) // brelse(bp) void log_write(struct buf *b) { int i; if (log.lh.n >= LOGSIZE || log.lh.n >= log.size − 1) panic("too big a transaction"); if (!log.intrans) panic("write outside of trans"); for (i = 0; i < log.lh.n; i++) { if (log.lh.sector[i] == b−>sector) // log absorbtion? break; } log.lh.sector[i] = b−>sector; struct buf *lbuf = bread(b−>dev, log.start+i+1); memmove(lbuf−>data, b−>data, BSIZE); bwrite(lbuf); brelse(lbuf); if (i == log.lh.n) log.lh.n++; } Sheet 41 Sep 5 23:39 2011 xv6/log.c Page 5 Sep 5 23:39 2011 xv6/fs.c Page 1 4200 // Blank page. 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231 4232 4233 4234 4235 4236 4237 4238 4239 4240 4241 4242 4243 4244 4245 4246 4247 4248 4249 4250 4251 4252 4253 4254 4255 4256 4257 4258 4259 4260 4261 4262 4263 4264 4265 4266 4267 4268 4269 4270 4271 4272 4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 Sheet 42 Sheet 42 // // // // // // // // // // // File system implementation. Four layers: + Blocks: allocator for raw disk blocks. + Files: inode allocator, reading, writing, metadata. + Directories: inode with special contents (list of other inodes!) + Names: paths like /usr/rtm/xv6/fs.c for convenient naming. Disk layout is: superblock, inodes, block in−use bitmap, data blocks. This file contains the low−level file system manipulation routines. The (higher−level) system call implementations are in sysfile.c. #include #include #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "stat.h" "mmu.h" "proc.h" "spinlock.h" "buf.h" "fs.h" "file.h" #define min(a, b) ((a) < (b) ? (a) : (b)) static void itrunc(struct inode*); // Read the super block. void readsb(int dev, struct superblock *sb) { struct buf *bp; bp = bread(dev, 1); memmove(sb, bp−>data, sizeof(*sb)); brelse(bp); } // Zero a block. static void bzero(int dev, int bno) { struct buf *bp; bp = bread(dev, bno); memset(bp−>data, 0, BSIZE); log_write(bp); brelse(bp); } Sep 5 23:39 2011 xv6/fs.c Page 2 Sep 5 23:39 2011 xv6/fs.c Page 3 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316 4317 4318 4319 4320 4321 4322 4323 4324 4325 4326 4327 4328 4329 4330 4331 4332 4333 4334 4335 4336 4337 4338 4339 4340 4341 4342 4343 4344 4345 4346 4347 4348 4349 4350 4351 4352 4353 4354 4355 4356 4357 4358 4359 4360 4361 4362 4363 4364 4365 4366 4367 4368 4369 4370 4371 4372 4373 4374 4375 4376 4377 4378 4379 4380 4381 4382 4383 4384 4385 4386 4387 4388 4389 4390 4391 4392 4393 4394 4395 4396 4397 4398 4399 // Blocks. // Allocate a zeroed disk block. static uint balloc(uint dev) { int b, bi, m; struct buf *bp; struct superblock sb; bp = 0; readsb(dev, &sb); for(b = 0; b < sb.size; b += BPB){ bp = bread(dev, BBLOCK(b, sb.ninodes)); for(bi = 0; bi < BPB && bi < (sb.size − b); bi++){ m = 1 << (bi % 8); if((bp−>data[bi/8] & m) == 0){ // Is block free? bp−>data[bi/8] |= m; // Mark block in use on disk. log_write(bp); brelse(bp); bzero(dev, b + bi); return b + bi; } } brelse(bp); } panic("balloc: out of blocks"); } // Free a disk block. static void bfree(int dev, uint b) { struct buf *bp; struct superblock sb; int bi, m; readsb(dev, &sb); bp = bread(dev, BBLOCK(b, sb.ninodes)); bi = b % BPB; m = 1 << (bi % 8); if((bp−>data[bi/8] & m) == 0) panic("freeing free block"); bp−>data[bi/8] &= ~m; // Mark block free on disk. log_write(bp); brelse(bp); } Sheet 43 // // // // // // // // // // // // // // // // // // // // // // // // // // // // // // // // Inodes. An inode is a single, unnamed file in the file system. The inode disk structure holds metadata (the type, device numbers, and data size) along with a list of blocks where the associated data can be found. The inodes are laid out sequentially on disk immediately after the superblock. The kernel keeps a cache of the in−use on−disk structures to provide a place for synchronizing access to inodes shared between multiple processes. ip−>ref counts the number of pointer references to this cached inode; references are typically kept in struct file and in proc−>cwd. When ip−>ref falls to zero, the inode is no longer cached. It is an error to use an inode without holding a reference to it. Processes are only allowed to read and write inode metadata and contents when holding the inode’s lock, represented by the I_BUSY flag in the in−memory copy. Because inode locks are held during disk accesses, they are implemented using a flag rather than with spin locks. Callers are responsible for locking inodes before passing them to routines in this file; leaving this responsibility with the caller makes it possible for them to create arbitrarily−sized atomic operations. To give maximum control over locking to the callers, the routines in this file that return inode pointers return pointers to *unlocked* inodes. It is the callers’ responsibility to lock them before using them. A non−zero ip−>ref keeps these unlocked inodes in the cache. struct { struct spinlock lock; struct inode inode[NINODE]; } icache; void iinit(void) { initlock(&icache.lock, "icache"); } static struct inode* iget(uint dev, uint inum); Sheet 43 Sep 5 23:39 2011 xv6/fs.c Page 4 Sep 5 23:39 2011 xv6/fs.c Page 5 4400 4401 4402 4403 4404 4405 4406 4407 4408 4409 4410 4411 4412 4413 4414 4415 4416 4417 4418 4419 4420 4421 4422 4423 4424 4425 4426 4427 4428 4429 4430 4431 4432 4433 4434 4435 4436 4437 4438 4439 4440 4441 4442 4443 4444 4445 4446 4447 4448 4449 4450 4451 4452 4453 4454 4455 4456 4457 4458 4459 4460 4461 4462 4463 4464 4465 4466 4467 4468 4469 4470 4471 4472 4473 4474 4475 4476 4477 4478 4479 4480 4481 4482 4483 4484 4485 4486 4487 4488 4489 4490 4491 4492 4493 4494 4495 4496 4497 4498 4499 // Allocate a new inode with the given type on device dev. struct inode* ialloc(uint dev, short type) { int inum; struct buf *bp; struct dinode *dip; struct superblock sb; readsb(dev, &sb); for(inum = 1; inum < sb.ninodes; inum++){ // loop over inode blocks bp = bread(dev, IBLOCK(inum)); dip = (struct dinode*)bp−>data + inum%IPB; if(dip−>type == 0){ // a free inode memset(dip, 0, sizeof(*dip)); dip−>type = type; log_write(bp); // mark it allocated on the disk brelse(bp); return iget(dev, inum); } brelse(bp); } panic("ialloc: no inodes"); } // Copy inode, which has changed, from memory to disk. void iupdate(struct inode *ip) { struct buf *bp; struct dinode *dip; bp = bread(ip−>dev, IBLOCK(ip−>inum)); dip = (struct dinode*)bp−>data + ip−>inum%IPB; dip−>type = ip−>type; dip−>major = ip−>major; dip−>minor = ip−>minor; dip−>nlink = ip−>nlink; dip−>size = ip−>size; memmove(dip−>addrs, ip−>addrs, sizeof(ip−>addrs)); log_write(bp); brelse(bp); } Sheet 44 // Find the inode with number inum on device dev // and return the in−memory copy. static struct inode* iget(uint dev, uint inum) { struct inode *ip, *empty; acquire(&icache.lock); // Try for cached inode. empty = 0; for(ip = &icache.inode[0]; ip < &icache.inode[NINODE]; ip++){ if(ip−>ref > 0 && ip−>dev == dev && ip−>inum == inum){ ip−>ref++; release(&icache.lock); return ip; } if(empty == 0 && ip−>ref == 0) // Remember empty slot. empty = ip; } // Allocate fresh inode. if(empty == 0) panic("iget: no inodes"); ip = empty; ip−>dev = dev; ip−>inum = inum; ip−>ref = 1; ip−>flags = 0; release(&icache.lock); return ip; } // Increment reference count for ip. // Returns ip to enable ip = idup(ip1) idiom. struct inode* idup(struct inode *ip) { acquire(&icache.lock); ip−>ref++; release(&icache.lock); return ip; } Sheet 44 Sep 5 23:39 2011 xv6/fs.c Page 6 Sep 5 23:39 2011 xv6/fs.c Page 7 4500 4501 4502 4503 4504 4505 4506 4507 4508 4509 4510 4511 4512 4513 4514 4515 4516 4517 4518 4519 4520 4521 4522 4523 4524 4525 4526 4527 4528 4529 4530 4531 4532 4533 4534 4535 4536 4537 4538 4539 4540 4541 4542 4543 4544 4545 4546 4547 4548 4549 4550 4551 4552 4553 4554 4555 4556 4557 4558 4559 4560 4561 4562 4563 4564 4565 4566 4567 4568 4569 4570 4571 4572 4573 4574 4575 4576 4577 4578 4579 4580 4581 4582 4583 4584 4585 4586 4587 4588 4589 4590 4591 4592 4593 4594 4595 4596 4597 4598 4599 // Lock the given inode. void ilock(struct inode *ip) { struct buf *bp; struct dinode *dip; if(ip == 0 || ip−>ref < 1) panic("ilock"); acquire(&icache.lock); while(ip−>flags & I_BUSY) sleep(ip, &icache.lock); ip−>flags |= I_BUSY; release(&icache.lock); if(!(ip−>flags & I_VALID)){ bp = bread(ip−>dev, IBLOCK(ip−>inum)); dip = (struct dinode*)bp−>data + ip−>inum%IPB; ip−>type = dip−>type; ip−>major = dip−>major; ip−>minor = dip−>minor; ip−>nlink = dip−>nlink; ip−>size = dip−>size; memmove(ip−>addrs, dip−>addrs, sizeof(ip−>addrs)); brelse(bp); ip−>flags |= I_VALID; if(ip−>type == 0) panic("ilock: no type"); } } // Unlock the given inode. void iunlock(struct inode *ip) { if(ip == 0 || !(ip−>flags & I_BUSY) || ip−>ref < 1) panic("iunlock"); acquire(&icache.lock); ip−>flags &= ~I_BUSY; wakeup(ip); release(&icache.lock); } Sheet 45 // Caller holds reference to unlocked ip. Drop reference. void iput(struct inode *ip) { acquire(&icache.lock); if(ip−>ref == 1 && (ip−>flags & I_VALID) && ip−>nlink == 0){ // inode is no longer used: truncate and free inode. if(ip−>flags & I_BUSY) panic("iput busy"); ip−>flags |= I_BUSY; release(&icache.lock); itrunc(ip); ip−>type = 0; iupdate(ip); acquire(&icache.lock); ip−>flags = 0; wakeup(ip); } ip−>ref−−; release(&icache.lock); } // Common idiom: unlock, then put. void iunlockput(struct inode *ip) { iunlock(ip); iput(ip); } Sheet 45 Sep 5 23:39 2011 xv6/fs.c Page 8 Sep 5 23:39 2011 xv6/fs.c Page 9 4600 4601 4602 4603 4604 4605 4606 4607 4608 4609 4610 4611 4612 4613 4614 4615 4616 4617 4618 4619 4620 4621 4622 4623 4624 4625 4626 4627 4628 4629 4630 4631 4632 4633 4634 4635 4636 4637 4638 4639 4640 4641 4642 4643 4644 4645 4646 4647 4648 4649 4650 4651 4652 4653 4654 4655 4656 4657 4658 4659 4660 4661 4662 4663 4664 4665 4666 4667 4668 4669 4670 4671 4672 4673 4674 4675 4676 4677 4678 4679 4680 4681 4682 4683 4684 4685 4686 4687 4688 4689 4690 4691 4692 4693 4694 4695 4696 4697 4698 4699 // // // // // // Inode contents The contents (data) associated with each inode is stored in a sequence of blocks on the disk. The first NDIRECT blocks are listed in ip−>addrs[]. The next NINDIRECT blocks are listed in the block ip−>addrs[NDIRECT]. // Return the disk block address of the nth block in inode ip. // If there is no such block, bmap allocates one. static uint bmap(struct inode *ip, uint bn) { uint addr, *a; struct buf *bp; if(bn < NDIRECT){ if((addr = ip−>addrs[bn]) == 0) ip−>addrs[bn] = addr = balloc(ip−>dev); return addr; } bn −= NDIRECT; if(bn < NINDIRECT){ // Load indirect block, allocating if necessary. if((addr = ip−>addrs[NDIRECT]) == 0) ip−>addrs[NDIRECT] = addr = balloc(ip−>dev); bp = bread(ip−>dev, addr); a = (uint*)bp−>data; if((addr = a[bn]) == 0){ a[bn] = addr = balloc(ip−>dev); log_write(bp); } brelse(bp); return addr; } panic("bmap: out of range"); } Sheet 46 // Truncate inode (discard contents). // Only called after the last dirent referring // to this inode has been erased on disk. static void itrunc(struct inode *ip) { int i, j; struct buf *bp; uint *a; for(i = 0; i < NDIRECT; i++){ if(ip−>addrs[i]){ bfree(ip−>dev, ip−>addrs[i]); ip−>addrs[i] = 0; } } if(ip−>addrs[NDIRECT]){ bp = bread(ip−>dev, ip−>addrs[NDIRECT]); a = (uint*)bp−>data; for(j = 0; j < NINDIRECT; j++){ if(a[j]) bfree(ip−>dev, a[j]); } brelse(bp); bfree(ip−>dev, ip−>addrs[NDIRECT]); ip−>addrs[NDIRECT] = 0; } ip−>size = 0; iupdate(ip); } // Copy stat information from inode. void stati(struct inode *ip, struct stat *st) { st−>dev = ip−>dev; st−>ino = ip−>inum; st−>type = ip−>type; st−>nlink = ip−>nlink; st−>size = ip−>size; } Sheet 46 Sep 5 23:39 2011 xv6/fs.c Page 10 Sep 5 23:39 2011 xv6/fs.c Page 11 4700 4701 4702 4703 4704 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 4715 4716 4717 4718 4719 4720 4721 4722 4723 4724 4725 4726 4727 4728 4729 4730 4731 4732 4733 4734 4735 4736 4737 4738 4739 4740 4741 4742 4743 4744 4745 4746 4747 4748 4749 4750 4751 4752 4753 4754 4755 4756 4757 4758 4759 4760 4761 4762 4763 4764 4765 4766 4767 4768 4769 4770 4771 4772 4773 4774 4775 4776 4777 4778 4779 4780 4781 4782 4783 4784 4785 4786 4787 4788 4789 4790 4791 4792 4793 4794 4795 4796 4797 4798 4799 // Read data from inode. int readi(struct inode *ip, char *dst, uint off, uint n) { uint tot, m; struct buf *bp; if(ip−>type == T_DEV){ if(ip−>major < 0 || ip−>major >= NDEV || !devsw[ip−>major].read) return −1; return devsw[ip−>major].read(ip, dst, n); } if(off > ip−>size || off + n < off) return −1; if(off + n > ip−>size) n = ip−>size − off; for(tot=0; totdev, bmap(ip, off/BSIZE)); m = min(n − tot, BSIZE − off%BSIZE); memmove(dst, bp−>data + off%BSIZE, m); brelse(bp); } return n; } Sheet 47 // Write data to inode. int writei(struct inode *ip, char *src, uint off, uint n) { uint tot, m; struct buf *bp; if(ip−>type == T_DEV){ if(ip−>major < 0 || ip−>major >= NDEV || !devsw[ip−>major].write) return −1; return devsw[ip−>major].write(ip, src, n); } if(off > return if(off + return ip−>size || off + n < off) −1; n > MAXFILE*BSIZE) −1; for(tot=0; totdev, bmap(ip, off/BSIZE)); m = min(n − tot, BSIZE − off%BSIZE); memmove(bp−>data + off%BSIZE, src, m); log_write(bp); brelse(bp); } if(n > 0 && off > ip−>size){ ip−>size = off; iupdate(ip); } return n; } Sheet 47 Sep 5 23:39 2011 xv6/fs.c Page 12 Sep 5 23:39 2011 xv6/fs.c Page 13 4800 4801 4802 4803 4804 4805 4806 4807 4808 4809 4810 4811 4812 4813 4814 4815 4816 4817 4818 4819 4820 4821 4822 4823 4824 4825 4826 4827 4828 4829 4830 4831 4832 4833 4834 4835 4836 4837 4838 4839 4840 4841 4842 4843 4844 4845 4846 4847 4848 4849 4850 4851 4852 4853 4854 4855 4856 4857 4858 4859 4860 4861 4862 4863 4864 4865 4866 4867 4868 4869 4870 4871 4872 4873 4874 4875 4876 4877 4878 4879 4880 4881 4882 4883 4884 4885 4886 4887 4888 4889 4890 4891 4892 4893 4894 4895 4896 4897 4898 4899 // Directories int namecmp(const char *s, const char *t) { return strncmp(s, t, DIRSIZ); } // Look for a directory entry in a directory. // If found, set *poff to byte offset of entry. // Caller must have already locked dp. struct inode* dirlookup(struct inode *dp, char *name, uint *poff) { uint off, inum; struct dirent de; if(dp−>type != T_DIR) panic("dirlookup not DIR"); for(off = 0; off < dp−>size; off += sizeof(de)){ if(readi(dp, (char*)&de, off, sizeof(de)) != sizeof(de)) panic("dirlink read"); if(de.inum == 0) continue; if(namecmp(name, de.name) == 0){ // entry matches path element if(poff) *poff = off; inum = de.inum; return iget(dp−>dev, inum); } } return 0; } Sheet 48 // Write a new directory entry (name, inum) into the directory dp. int dirlink(struct inode *dp, char *name, uint inum) { int off; struct dirent de; struct inode *ip; // Check that name is not present. if((ip = dirlookup(dp, name, 0)) != 0){ iput(ip); return −1; } // Look for an empty dirent. for(off = 0; off < dp−>size; off += sizeof(de)){ if(readi(dp, (char*)&de, off, sizeof(de)) != sizeof(de)) panic("dirlink read"); if(de.inum == 0) break; } strncpy(de.name, name, DIRSIZ); de.inum = inum; if(writei(dp, (char*)&de, off, sizeof(de)) != sizeof(de)) panic("dirlink"); return 0; } Sheet 48 Sep 5 23:39 2011 xv6/fs.c Page 14 Sep 5 23:39 2011 xv6/fs.c Page 15 4900 4901 4902 4903 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4923 4924 4925 4926 4927 4928 4929 4930 4931 4932 4933 4934 4935 4936 4937 4938 4939 4940 4941 4942 4943 4944 4945 4946 4947 4948 4949 4950 4951 4952 4953 4954 4955 4956 4957 4958 4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974 4975 4976 4977 4978 4979 4980 4981 4982 4983 4984 4985 4986 4987 4988 4989 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 // Paths // Copy the next path element from path into name. // Return a pointer to the element following the copied one. // The returned path has no leading slashes, // so the caller can check *path==’\0’ to see if the name is the last one. // If no name to remove, return 0. // // Examples: // skipelem("a/bb/c", name) = "bb/c", setting name = "a" // skipelem("///a//bb", name) = "bb", setting name = "a" // skipelem("a", name) = "", setting name = "a" // skipelem("", name) = skipelem("////", name) = 0 // static char* skipelem(char *path, char *name) { char *s; int len; while(*path == ’/’) path++; if(*path == 0) return 0; s = path; while(*path != ’/’ && *path != 0) path++; len = path − s; if(len >= DIRSIZ) memmove(name, s, DIRSIZ); else { memmove(name, s, len); name[len] = 0; } while(*path == ’/’) path++; return path; } Sheet 49 // Look up and return the inode for a path name. // If parent != 0, return the inode for the parent and copy the final // path element into name, which must have room for DIRSIZ bytes. static struct inode* namex(char *path, int nameiparent, char *name) { struct inode *ip, *next; if(*path == ’/’) ip = iget(ROOTDEV, ROOTINO); else ip = idup(proc−>cwd); while((path = skipelem(path, name)) != 0){ ilock(ip); if(ip−>type != T_DIR){ iunlockput(ip); return 0; } if(nameiparent && *path == ’\0’){ // Stop one level early. iunlock(ip); return ip; } if((next = dirlookup(ip, name, 0)) == 0){ iunlockput(ip); return 0; } iunlockput(ip); ip = next; } if(nameiparent){ iput(ip); return 0; } return ip; } struct inode* namei(char *path) { char name[DIRSIZ]; return namex(path, 0, name); } struct inode* nameiparent(char *path, char *name) { return namex(path, 1, name); } Sheet 49 Sep 5 23:39 2011 xv6/file.c Page 1 Sep 5 23:39 2011 xv6/file.c Page 2 5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010 5011 5012 5013 5014 5015 5016 5017 5018 5019 5020 5021 5022 5023 5024 5025 5026 5027 5028 5029 5030 5031 5032 5033 5034 5035 5036 5037 5038 5039 5040 5041 5042 5043 5044 5045 5046 5047 5048 5049 5050 5051 5052 5053 5054 5055 5056 5057 5058 5059 5060 5061 5062 5063 5064 5065 5066 5067 5068 5069 5070 5071 5072 5073 5074 5075 5076 5077 5078 5079 5080 5081 5082 5083 5084 5085 5086 5087 5088 5089 5090 5091 5092 5093 5094 5095 5096 5097 5098 5099 #include #include #include #include #include #include "types.h" "defs.h" "param.h" "fs.h" "file.h" "spinlock.h" struct devsw devsw[NDEV]; struct { struct spinlock lock; struct file file[NFILE]; } ftable; void fileinit(void) { initlock(&ftable.lock, "ftable"); } // Allocate a file structure. struct file* filealloc(void) { struct file *f; acquire(&ftable.lock); for(f = ftable.file; f < ftable.file + NFILE; f++){ if(f−>ref == 0){ f−>ref = 1; release(&ftable.lock); return f; } } release(&ftable.lock); return 0; } // Increment ref count for file f. struct file* filedup(struct file *f) { acquire(&ftable.lock); if(f−>ref < 1) panic("filedup"); f−>ref++; release(&ftable.lock); return f; } Sheet 50 // Close file f. (Decrement ref count, close when reaches 0.) void fileclose(struct file *f) { struct file ff; acquire(&ftable.lock); if(f−>ref < 1) panic("fileclose"); if(−−f−>ref > 0){ release(&ftable.lock); return; } ff = *f; f−>ref = 0; f−>type = FD_NONE; release(&ftable.lock); if(ff.type == FD_PIPE) pipeclose(ff.pipe, ff.writable); else if(ff.type == FD_INODE){ begin_trans(); iput(ff.ip); commit_trans(); } } // Get metadata about file f. int filestat(struct file *f, struct stat *st) { if(f−>type == FD_INODE){ ilock(f−>ip); stati(f−>ip, st); iunlock(f−>ip); return 0; } return −1; } Sheet 50 Sep 5 23:39 2011 xv6/file.c Page 3 Sep 5 23:39 2011 xv6/file.c Page 4 5100 5101 5102 5103 5104 5105 5106 5107 5108 5109 5110 5111 5112 5113 5114 5115 5116 5117 5118 5119 5120 5121 5122 5123 5124 5125 5126 5127 5128 5129 5130 5131 5132 5133 5134 5135 5136 5137 5138 5139 5140 5141 5142 5143 5144 5145 5146 5147 5148 5149 5150 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 // Read from file f. Addr is kernel address. int fileread(struct file *f, char *addr, int n) { int r; if(f−>readable == 0) return −1; if(f−>type == FD_PIPE) return piperead(f−>pipe, addr, n); if(f−>type == FD_INODE){ ilock(f−>ip); if((r = readi(f−>ip, addr, f−>off, n)) > 0) f−>off += r; iunlock(f−>ip); return r; } panic("fileread"); } Sheet 51 // Write to file f. Addr is kernel address. int filewrite(struct file *f, char *addr, int n) { int r; if(f−>writable == 0) return −1; if(f−>type == FD_PIPE) return pipewrite(f−>pipe, addr, n); if(f−>type == FD_INODE){ // write a few blocks at a time to avoid exceeding // the maximum log transaction size, including // i−node, indirect block, allocation blocks, // and 2 blocks of slop for non−aligned writes. // this really belongs lower down, since writei() // might be writing a device like the console. int max = ((LOGSIZE−1−1−2) / 2) * 512; int i = 0; while(i < n){ int n1 = n − i; if(n1 > max) n1 = max; begin_trans(); ilock(f−>ip); if ((r = writei(f−>ip, addr + i, f−>off, n1)) > 0) f−>off += r; iunlock(f−>ip); commit_trans(); if(r < 0) break; if(r != n1) panic("short filewrite"); i += r; } return i == n ? n : −1; } panic("filewrite"); } Sheet 51 Sep 5 23:39 2011 xv6/sysfile.c Page 1 Sep 5 23:39 2011 xv6/sysfile.c Page 2 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5250 5251 5252 5253 5254 5255 5256 5257 5258 5259 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 #include #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "stat.h" "mmu.h" "proc.h" "fs.h" "file.h" "fcntl.h" // Fetch the nth word−sized system call argument as a file descriptor // and return both the descriptor and the corresponding struct file. static int argfd(int n, int *pfd, struct file **pf) { int fd; struct file *f; if(argint(n, &fd) < 0) return −1; if(fd < 0 || fd >= NOFILE || (f=proc−>ofile[fd]) == 0) return −1; if(pfd) *pfd = fd; if(pf) *pf = f; return 0; } // Allocate a file descriptor for the given file. // Takes over file reference from caller on success. static int fdalloc(struct file *f) { int fd; for(fd = 0; fd < NOFILE; fd++){ if(proc−>ofile[fd] == 0){ proc−>ofile[fd] = f; return fd; } } return −1; } Sheet 52 int sys_dup(void) { struct file *f; int fd; if(argfd(0, 0, &f) < 0) return −1; if((fd=fdalloc(f)) < 0) return −1; filedup(f); return fd; } int sys_read(void) { struct file *f; int n; char *p; if(argfd(0, 0, &f) < 0 || argint(2, &n) < 0 || argptr(1, &p, n) < 0) return −1; return fileread(f, p, n); } int sys_write(void) { struct file *f; int n; char *p; if(argfd(0, 0, &f) < 0 || argint(2, &n) < 0 || argptr(1, &p, n) < 0) return −1; return filewrite(f, p, n); } int sys_close(void) { int fd; struct file *f; if(argfd(0, &fd, &f) < 0) return −1; proc−>ofile[fd] = 0; fileclose(f); return 0; } Sheet 52 Sep 5 23:39 2011 xv6/sysfile.c Page 3 Sep 5 23:39 2011 xv6/sysfile.c Page 4 5300 5301 5302 5303 5304 5305 5306 5307 5308 5309 5310 5311 5312 5313 5314 5315 5316 5317 5318 5319 5320 5321 5322 5323 5324 5325 5326 5327 5328 5329 5330 5331 5332 5333 5334 5335 5336 5337 5338 5339 5340 5341 5342 5343 5344 5345 5346 5347 5348 5349 5350 5351 5352 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378 5379 5380 5381 5382 5383 5384 5385 5386 5387 5388 5389 5390 5391 5392 5393 5394 5395 5396 5397 5398 5399 int sys_fstat(void) { struct file *f; struct stat *st; if(argfd(0, 0, &f) < 0 || argptr(1, (void*)&st, sizeof(*st)) < 0) return −1; return filestat(f, st); } // Create the path new as a link to the same inode as old. int sys_link(void) { char name[DIRSIZ], *new, *old; struct inode *dp, *ip; if(argstr(0, &old) < 0 || argstr(1, &new) < 0) return −1; if((ip = namei(old)) == 0) return −1; begin_trans(); ilock(ip); if(ip−>type == T_DIR){ iunlockput(ip); commit_trans(); return −1; } ip−>nlink++; iupdate(ip); iunlock(ip); if((dp = nameiparent(new, name)) == 0) goto bad; ilock(dp); if(dp−>dev != ip−>dev || dirlink(dp, name, ip−>inum) < 0){ iunlockput(dp); goto bad; } iunlockput(dp); iput(ip); commit_trans(); return 0; Sheet 53 bad: ilock(ip); ip−>nlink−−; iupdate(ip); iunlockput(ip); commit_trans(); return −1; } // Is the directory dp empty except for "." and ".." ? static int isdirempty(struct inode *dp) { int off; struct dirent de; for(off=2*sizeof(de); offsize; off+=sizeof(de)){ if(readi(dp, (char*)&de, off, sizeof(de)) != sizeof(de)) panic("isdirempty: readi"); if(de.inum != 0) return 0; } return 1; } Sheet 53 Sep 5 23:39 2011 xv6/sysfile.c Page 5 Sep 5 23:39 2011 xv6/sysfile.c Page 6 5400 int 5401 sys_unlink(void) 5402 { 5403 struct inode *ip, *dp; 5404 struct dirent de; 5405 char name[DIRSIZ], *path; 5406 uint off; 5407 5408 if(argstr(0, &path) < 0) 5409 return −1; 5410 if((dp = nameiparent(path, name)) == 0) 5411 return −1; 5412 5413 begin_trans(); 5414 5415 ilock(dp); 5416 5417 // Cannot unlink "." or "..". 5418 if(namecmp(name, ".") == 0 || namecmp(name, "..") == 0) 5419 goto bad; 5420 5421 if((ip = dirlookup(dp, name, &off)) == 0) 5422 goto bad; 5423 ilock(ip); 5424 5425 if(ip−>nlink < 1) 5426 panic("unlink: nlink < 1"); 5427 if(ip−>type == T_DIR && !isdirempty(ip)){ 5428 iunlockput(ip); 5429 goto bad; 5430 } 5431 5432 memset(&de, 0, sizeof(de)); 5433 if(writei(dp, (char*)&de, off, sizeof(de)) != sizeof(de)) 5434 panic("unlink: writei"); 5435 if(ip−>type == T_DIR){ 5436 dp−>nlink−−; 5437 iupdate(dp); 5438 } 5439 iunlockput(dp); 5440 5441 ip−>nlink−−; 5442 iupdate(ip); 5443 iunlockput(ip); 5444 5445 commit_trans(); 5446 5447 return 0; 5448 5449 5450 5451 5452 5453 5454 5455 5456 5457 5458 5459 5460 5461 5462 5463 5464 5465 5466 5467 5468 5469 5470 5471 5472 5473 5474 5475 5476 5477 5478 5479 5480 5481 5482 5483 5484 5485 5486 5487 5488 5489 5490 5491 5492 5493 5494 5495 5496 5497 5498 5499 Sheet 54 Sheet 54 bad: iunlockput(dp); commit_trans(); return −1; } static struct inode* create(char *path, short type, short major, short minor) { uint off; struct inode *ip, *dp; char name[DIRSIZ]; if((dp = nameiparent(path, name)) == 0) return 0; ilock(dp); if((ip = dirlookup(dp, name, &off)) != 0){ iunlockput(dp); ilock(ip); if(type == T_FILE && ip−>type == T_FILE) return ip; iunlockput(ip); return 0; } if((ip = ialloc(dp−>dev, type)) == 0) panic("create: ialloc"); ilock(ip); ip−>major = major; ip−>minor = minor; ip−>nlink = 1; iupdate(ip); if(type == T_DIR){ // Create . and .. entries. dp−>nlink++; // for ".." iupdate(dp); // No ip−>nlink++ for ".": avoid cyclic ref count. if(dirlink(ip, ".", ip−>inum) < 0 || dirlink(ip, "..", dp−>inum) < 0) panic("create dots"); } if(dirlink(dp, name, ip−>inum) < 0) panic("create: dirlink"); iunlockput(dp); return ip; } Sep 5 23:39 2011 xv6/sysfile.c Page 7 Sep 5 23:39 2011 xv6/sysfile.c Page 8 5500 5501 5502 5503 5504 5505 5506 5507 5508 5509 5510 5511 5512 5513 5514 5515 5516 5517 5518 5519 5520 5521 5522 5523 5524 5525 5526 5527 5528 5529 5530 5531 5532 5533 5534 5535 5536 5537 5538 5539 5540 5541 5542 5543 5544 5545 5546 5547 5548 5549 5550 5551 5552 5553 5554 5555 5556 5557 5558 5559 5560 5561 5562 5563 5564 5565 5566 5567 5568 5569 5570 5571 5572 5573 5574 5575 5576 5577 5578 5579 5580 5581 5582 5583 5584 5585 5586 5587 5588 5589 5590 5591 5592 5593 5594 5595 5596 5597 5598 5599 int sys_open(void) { char *path; int fd, omode; struct file *f; struct inode *ip; if(argstr(0, &path) < 0 || argint(1, &omode) < 0) return −1; if(omode & O_CREATE){ begin_trans(); ip = create(path, T_FILE, 0, 0); commit_trans(); if(ip == 0) return −1; } else { if((ip = namei(path)) == 0) return −1; ilock(ip); if(ip−>type == T_DIR && omode != O_RDONLY){ iunlockput(ip); return −1; } } if((f = filealloc()) == 0 || (fd = fdalloc(f)) < 0){ if(f) fileclose(f); iunlockput(ip); return −1; } iunlock(ip); f−>type = FD_INODE; f−>ip = ip; f−>off = 0; f−>readable = !(omode & O_WRONLY); f−>writable = (omode & O_WRONLY) || (omode & O_RDWR); return fd; } Sheet 55 int sys_mkdir(void) { char *path; struct inode *ip; begin_trans(); if(argstr(0, &path) < 0 || (ip = create(path, T_DIR, 0, 0)) == 0){ commit_trans(); return −1; } iunlockput(ip); commit_trans(); return 0; } int sys_mknod(void) { struct inode *ip; char *path; int len; int major, minor; begin_trans(); if((len=argstr(0, &path)) < 0 || argint(1, &major) < 0 || argint(2, &minor) < 0 || (ip = create(path, T_DEV, major, minor)) == 0){ commit_trans(); return −1; } iunlockput(ip); commit_trans(); return 0; } Sheet 55 Sep 5 23:39 2011 xv6/sysfile.c Page 9 Sep 5 23:39 2011 xv6/sysfile.c Page 10 5600 5601 5602 5603 5604 5605 5606 5607 5608 5609 5610 5611 5612 5613 5614 5615 5616 5617 5618 5619 5620 5621 5622 5623 5624 5625 5626 5627 5628 5629 5630 5631 5632 5633 5634 5635 5636 5637 5638 5639 5640 5641 5642 5643 5644 5645 5646 5647 5648 5649 5650 5651 5652 5653 5654 5655 5656 5657 5658 5659 5660 5661 5662 5663 5664 5665 5666 5667 5668 5669 5670 5671 5672 5673 5674 5675 5676 5677 5678 5679 5680 5681 5682 5683 5684 5685 5686 5687 5688 5689 5690 5691 5692 5693 5694 5695 5696 5697 5698 5699 int sys_chdir(void) { char *path; struct inode *ip; if(argstr(0, &path) < 0 || (ip = namei(path)) == 0) return −1; ilock(ip); if(ip−>type != T_DIR){ iunlockput(ip); return −1; } iunlock(ip); iput(proc−>cwd); proc−>cwd = ip; return 0; } int sys_exec(void) { char *path, *argv[MAXARG]; int i; uint uargv, uarg; if(argstr(0, &path) < 0 || argint(1, (int*)&uargv) < 0){ return −1; } memset(argv, 0, sizeof(argv)); for(i=0;; i++){ if(i >= NELEM(argv)) return −1; if(fetchint(proc, uargv+4*i, (int*)&uarg) < 0) return −1; if(uarg == 0){ argv[i] = 0; break; } if(fetchstr(proc, uarg, &argv[i]) < 0) return −1; } return exec(path, argv); } Sheet 56 int sys_pipe(void) { int *fd; struct file *rf, *wf; int fd0, fd1; if(argptr(0, (void*)&fd, 2*sizeof(fd[0])) < 0) return −1; if(pipealloc(&rf, &wf) < 0) return −1; fd0 = −1; if((fd0 = fdalloc(rf)) < 0 || (fd1 = fdalloc(wf)) < 0){ if(fd0 >= 0) proc−>ofile[fd0] = 0; fileclose(rf); fileclose(wf); return −1; } fd[0] = fd0; fd[1] = fd1; return 0; } Sheet 56 Sep 5 23:39 2011 xv6/exec.c Page 1 Sep 5 23:39 2011 xv6/exec.c Page 2 5700 5701 5702 5703 5704 5705 5706 5707 5708 5709 5710 5711 5712 5713 5714 5715 5716 5717 5718 5719 5720 5721 5722 5723 5724 5725 5726 5727 5728 5729 5730 5731 5732 5733 5734 5735 5736 5737 5738 5739 5740 5741 5742 5743 5744 5745 5746 5747 5748 5749 5750 // Allocate two pages at the next page boundary. 5751 // Make the first inaccessible. Use the second as the user stack. 5752 sz = PGROUNDUP(sz); 5753 if((sz = allocuvm(pgdir, sz, sz + 2*PGSIZE)) == 0) 5754 goto bad; 5755 clearpteu(pgdir, (char*)(sz − 2*PGSIZE)); 5756 sp = sz; 5757 5758 // Push argument strings, prepare rest of stack in ustack. 5759 for(argc = 0; argv[argc]; argc++) { 5760 if(argc >= MAXARG) 5761 goto bad; 5762 sp = (sp − (strlen(argv[argc]) + 1)) & ~3; 5763 if(copyout(pgdir, sp, argv[argc], strlen(argv[argc]) + 1) < 0) 5764 goto bad; 5765 ustack[3+argc] = sp; 5766 } 5767 ustack[3+argc] = 0; 5768 5769 ustack[0] = 0xffffffff; // fake return PC 5770 ustack[1] = argc; 5771 ustack[2] = sp − (argc+1)*4; // argv pointer 5772 5773 sp −= (3+argc+1) * 4; 5774 if(copyout(pgdir, sp, ustack, (3+argc+1)*4) < 0) 5775 goto bad; 5776 5777 // Save program name for debugging. 5778 for(last=s=path; *s; s++) 5779 if(*s == ’/’) 5780 last = s+1; 5781 safestrcpy(proc−>name, last, sizeof(proc−>name)); 5782 5783 // Commit to the user image. 5784 oldpgdir = proc−>pgdir; 5785 proc−>pgdir = pgdir; 5786 proc−>sz = sz; 5787 proc−>tf−>eip = elf.entry; // main 5788 proc−>tf−>esp = sp; 5789 switchuvm(proc); 5790 freevm(oldpgdir); 5791 return 0; 5792 5793 bad: 5794 if(pgdir) 5795 freevm(pgdir); 5796 if(ip) 5797 iunlockput(ip); 5798 return −1; 5799 } #include #include #include #include #include #include #include #include "types.h" "param.h" "memlayout.h" "mmu.h" "proc.h" "defs.h" "x86.h" "elf.h" int exec(char *path, char **argv) { char *s, *last; int i, off; uint argc, sz, sp, ustack[3+MAXARG+1]; struct elfhdr elf; struct inode *ip; struct proghdr ph; pde_t *pgdir, *oldpgdir; if((ip = namei(path)) == 0) return −1; ilock(ip); pgdir = 0; // Check ELF header if(readi(ip, (char*)&elf, 0, sizeof(elf)) < sizeof(elf)) goto bad; if(elf.magic != ELF_MAGIC) goto bad; if((pgdir = setupkvm(kalloc)) == 0) goto bad; // Load program into memory. sz = 0; for(i=0, off=elf.phoff; ireadopen = 1; p−>writeopen = 1; p−>nwrite = 0; p−>nread = 0; initlock(&p−>lock, "pipe"); (*f0)−>type = FD_PIPE; (*f0)−>readable = 1; (*f0)−>writable = 0; (*f0)−>pipe = p; (*f1)−>type = FD_PIPE; (*f1)−>readable = 0; (*f1)−>writable = 1; (*f1)−>pipe = p; return 0; Sheet 58 bad: if(p) kfree((char*)p); if(*f0) fileclose(*f0); if(*f1) fileclose(*f1); return −1; } void pipeclose(struct pipe *p, int writable) { acquire(&p−>lock); if(writable){ p−>writeopen = 0; wakeup(&p−>nread); } else { p−>readopen = 0; wakeup(&p−>nwrite); } if(p−>readopen == 0 && p−>writeopen == 0){ release(&p−>lock); kfree((char*)p); } else release(&p−>lock); } int pipewrite(struct pipe *p, char *addr, int n) { int i; acquire(&p−>lock); for(i = 0; i < n; i++){ while(p−>nwrite == p−>nread + PIPESIZE){ if(p−>readopen == 0 || proc−>killed){ release(&p−>lock); return −1; } wakeup(&p−>nread); sleep(&p−>nwrite, &p−>lock); } p−>data[p−>nwrite++ % PIPESIZE] = addr[i]; } wakeup(&p−>nread); release(&p−>lock); return n; } Sheet 58 Sep 5 23:39 2011 xv6/pipe.c Page 3 Sep 5 23:39 2011 xv6/string.c Page 1 5900 5901 5902 5903 5904 5905 5906 5907 5908 5909 5910 5911 5912 5913 5914 5915 5916 5917 5918 5919 5920 5921 5922 5923 5924 5925 5926 5927 5928 5929 5930 5931 5932 5933 5934 5935 5936 5937 5938 5939 5940 5941 5942 5943 5944 5945 5946 5947 5948 5949 5950 5951 5952 5953 5954 5955 5956 5957 5958 5959 5960 5961 5962 5963 5964 5965 5966 5967 5968 5969 5970 5971 5972 5973 5974 5975 5976 5977 5978 5979 5980 5981 5982 5983 5984 5985 5986 5987 5988 5989 5990 5991 5992 5993 5994 5995 5996 5997 5998 5999 int piperead(struct pipe *p, char *addr, int n) { int i; acquire(&p−>lock); while(p−>nread == p−>nwrite && p−>writeopen){ if(proc−>killed){ release(&p−>lock); return −1; } sleep(&p−>nread, &p−>lock); } for(i = 0; i < n; i++){ if(p−>nread == p−>nwrite) break; addr[i] = p−>data[p−>nread++ % PIPESIZE]; } wakeup(&p−>nwrite); release(&p−>lock); return i; } Sheet 59 #include "types.h" #include "x86.h" void* memset(void *dst, int c, uint n) { if ((int)dst%4 == 0 && n%4 == 0){ c &= 0xFF; stosl(dst, (c<<24)|(c<<16)|(c<<8)|c, n/4); } else stosb(dst, c, n); return dst; } int memcmp(const void *v1, const void *v2, uint n) { const uchar *s1, *s2; s1 = v1; s2 = v2; while(n−− > 0){ if(*s1 != *s2) return *s1 − *s2; s1++, s2++; } return 0; } void* memmove(void *dst, const void *src, uint n) { const char *s; char *d; s = src; d = dst; if(s < d && s + n > d){ s += n; d += n; while(n−− > 0) *−−d = *−−s; } else while(n−− > 0) *d++ = *s++; return dst; } Sheet 59 Sep 5 23:39 2011 xv6/string.c Page 2 Sep 5 23:39 2011 xv6/string.c Page 3 6000 6001 6002 6003 6004 6005 6006 6007 6008 6009 6010 6011 6012 6013 6014 6015 6016 6017 6018 6019 6020 6021 6022 6023 6024 6025 6026 6027 6028 6029 6030 6031 6032 6033 6034 6035 6036 6037 6038 6039 6040 6041 6042 6043 6044 6045 6046 6047 6048 6049 6050 6051 6052 6053 6054 6055 6056 6057 6058 6059 6060 6061 6062 6063 6064 6065 6066 6067 6068 6069 6070 6071 6072 6073 6074 6075 6076 6077 6078 6079 6080 6081 6082 6083 6084 6085 6086 6087 6088 6089 6090 6091 6092 6093 6094 6095 6096 6097 6098 6099 // memcpy exists to placate GCC. Use memmove. void* memcpy(void *dst, const void *src, uint n) { return memmove(dst, src, n); } int strncmp(const char *p, const char *q, uint n) { while(n > 0 && *p && *p == *q) n−−, p++, q++; if(n == 0) return 0; return (uchar)*p − (uchar)*q; } char* strncpy(char *s, const char *t, int n) { char *os; os = s; while(n−− > 0 && (*s++ = *t++) != 0) ; while(n−− > 0) *s++ = 0; return os; } // Like strncpy but guaranteed to NUL−terminate. char* safestrcpy(char *s, const char *t, int n) { char *os; os = s; if(n <= 0) return os; while(−−n > 0 && (*s++ = *t++) != 0) ; *s = 0; return os; } Sheet 60 int strlen(const char *s) { int n; for(n = 0; s[n]; n++) ; return n; } Sheet 60 Sep 5 23:39 2011 xv6/mp.h Page 1 Sep 5 23:39 2011 xv6/mp.h Page 2 6100 6101 6102 6103 6104 6105 6106 6107 6108 6109 6110 6111 6112 6113 6114 6115 6116 6117 6118 6119 6120 6121 6122 6123 6124 6125 6126 6127 6128 6129 6130 6131 6132 6133 6134 6135 6136 6137 6138 6139 6140 6141 6142 6143 6144 6145 6146 6147 6148 6149 6150 6151 6152 6153 6154 6155 6156 6157 6158 6159 6160 6161 6162 6163 6164 6165 6166 6167 6168 6169 6170 6171 6172 6173 6174 6175 6176 6177 6178 6179 6180 6181 6182 6183 6184 6185 6186 6187 6188 6189 6190 6191 6192 6193 6194 6195 6196 6197 6198 6199 // See MultiProcessor Specification Version 1.[14] struct mp { uchar signature[4]; void *physaddr; uchar length; uchar specrev; uchar checksum; uchar type; uchar imcrp; uchar reserved[3]; }; // floating pointer // "_MP_" // phys addr of MP config table // 1 // [14] // all bytes must add up to 0 // MP system config type struct mpconf { uchar signature[4]; ushort length; uchar version; uchar checksum; uchar product[20]; uint *oemtable; ushort oemlength; ushort entry; uint *lapicaddr; ushort xlength; uchar xchecksum; uchar reserved; }; // configuration table header // "PCMP" // total table length // [14] // all bytes must add up to 0 // product id // OEM table pointer // OEM table length // entry count // address of local APIC // extended table length // extended table checksum struct mpproc { // processor table entry uchar type; // entry type (0) uchar apicid; // local APIC id uchar version; // local APIC verison uchar flags; // CPU flags #define MPBOOT 0x02 // This proc is the bootstrap processor. uchar signature[4]; // CPU signature uint feature; // feature flags from CPUID instruction uchar reserved[8]; }; struct mpioapic { uchar type; uchar apicno; uchar version; uchar flags; uint *addr; }; Sheet 61 // I/O APIC table entry // entry type (2) // I/O APIC id // I/O APIC version // I/O APIC flags // I/O APIC address // Table entry types #define MPPROC 0x00 #define MPBUS 0x01 #define MPIOAPIC 0x02 #define MPIOINTR 0x03 #define MPLINTR 0x04 Sheet 61 // // // // // One One One One One per per per per per processor bus I/O APIC bus interrupt source system interrupt source Sep 5 23:39 2011 xv6/mp.c Page 1 Sep 5 23:39 2011 xv6/mp.c Page 2 6200 6201 6202 6203 6204 6205 6206 6207 6208 6209 6210 6211 6212 6213 6214 6215 6216 6217 6218 6219 6220 6221 6222 6223 6224 6225 6226 6227 6228 6229 6230 6231 6232 6233 6234 6235 6236 6237 6238 6239 6240 6241 6242 6243 6244 6245 6246 6247 6248 6249 6250 6251 6252 6253 6254 6255 6256 6257 6258 6259 6260 6261 6262 6263 6264 6265 6266 6267 6268 6269 6270 6271 6272 6273 6274 6275 6276 6277 6278 6279 6280 6281 6282 6283 6284 6285 6286 6287 6288 6289 6290 6291 6292 6293 6294 6295 6296 6297 6298 6299 // Multiprocessor support // Search memory for MP description structures. // http://developer.intel.com/design/pentium/datashts/24201606.pdf #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "memlayout.h" "mp.h" "x86.h" "mmu.h" "proc.h" struct cpu cpus[NCPU]; static struct cpu *bcpu; int ismp; int ncpu; uchar ioapicid; int mpbcpu(void) { return bcpu−cpus; } static uchar sum(uchar *addr, int len) { int i, sum; sum = 0; for(i=0; iphysaddr == 0) return 0; conf = (struct mpconf*) p2v((uint) mp−>physaddr); if(memcmp(conf, "PCMP", 4) != 0) return 0; if(conf−>version != 1 && conf−>version != 4) return 0; if(sum((uchar*)conf, conf−>length) != 0) return 0; *pmp = mp; return conf; } Sheet 62 Sep 5 23:39 2011 xv6/mp.c Page 3 Sep 5 23:39 2011 xv6/mp.c Page 4 6300 void 6301 mpinit(void) 6302 { 6303 uchar *p, *e; 6304 struct mp *mp; 6305 struct mpconf *conf; 6306 struct mpproc *proc; 6307 struct mpioapic *ioapic; 6308 6309 bcpu = &cpus[0]; 6310 if((conf = mpconfig(&mp)) == 0) 6311 return; 6312 ismp = 1; 6313 lapic = (uint*)conf−>lapicaddr; 6314 for(p=(uchar*)(conf+1), e=(uchar*)conf+conf−>length; papicid){ 6319 cprintf("mpinit: ncpu=%d apicid=%d\n", ncpu, proc−>apicid); 6320 ismp = 0; 6321 } 6322 if(proc−>flags & MPBOOT) 6323 bcpu = &cpus[ncpu]; 6324 cpus[ncpu].id = ncpu; 6325 ncpu++; 6326 p += sizeof(struct mpproc); 6327 continue; 6328 case MPIOAPIC: 6329 ioapic = (struct mpioapic*)p; 6330 ioapicid = ioapic−>apicno; 6331 p += sizeof(struct mpioapic); 6332 continue; 6333 case MPBUS: 6334 case MPIOINTR: 6335 case MPLINTR: 6336 p += 8; 6337 continue; 6338 default: 6339 cprintf("mpinit: unknown config type %x\n", *p); 6340 ismp = 0; 6341 } 6342 } 6343 if(!ismp){ 6344 // Didn’t like what we found; fall back to no MP. 6345 ncpu = 1; 6346 lapic = 0; 6347 ioapicid = 0; 6348 return; 6349 } 6350 if(mp−>imcrp){ 6351 // Bochs doesn’t support IMCR, so this doesn’t run on Bochs. 6352 // But it would on real hardware. 6353 outb(0x22, 0x70); // Select IMCR 6354 outb(0x23, inb(0x23) | 1); // Mask external interrupts. 6355 } 6356 } 6357 6358 6359 6360 6361 6362 6363 6364 6365 6366 6367 6368 6369 6370 6371 6372 6373 6374 6375 6376 6377 6378 6379 6380 6381 6382 6383 6384 6385 6386 6387 6388 6389 6390 6391 6392 6393 6394 6395 6396 6397 6398 6399 Sheet 63 Sheet 63 Sep 5 23:39 2011 xv6/lapic.c Page 1 Sep 5 23:39 2011 xv6/lapic.c Page 2 6400 6401 6402 6403 6404 6405 6406 6407 6408 6409 6410 6411 6412 6413 6414 6415 6416 6417 6418 6419 6420 6421 6422 6423 6424 6425 6426 6427 6428 6429 6430 6431 6432 6433 6434 6435 6436 6437 6438 6439 6440 6441 6442 6443 6444 6445 6446 6447 6448 6449 6450 6451 6452 6453 6454 6455 6456 6457 6458 6459 6460 6461 6462 6463 6464 6465 6466 6467 6468 6469 6470 6471 6472 6473 6474 6475 6476 6477 6478 6479 6480 6481 6482 6483 6484 6485 6486 6487 6488 6489 6490 6491 6492 6493 6494 6495 6496 6497 6498 6499 // The local APIC manages internal (non−I/O) interrupts. // See Chapter 8 & Appendix C of Intel processor manual volume 3. #include #include #include #include #include #include "types.h" "defs.h" "memlayout.h" "traps.h" "mmu.h" "x86.h" // Local APIC registers, divided by 4 for use as uint[] indices. #define ID (0x0020/4) // ID #define VER (0x0030/4) // Version #define TPR (0x0080/4) // Task Priority #define EOI (0x00B0/4) // EOI #define SVR (0x00F0/4) // Spurious Interrupt Vector #define ENABLE 0x00000100 // Unit Enable #define ESR (0x0280/4) // Error Status #define ICRLO (0x0300/4) // Interrupt Command #define INIT 0x00000500 // INIT/RESET #define STARTUP 0x00000600 // Startup IPI #define DELIVS 0x00001000 // Delivery status #define ASSERT 0x00004000 // Assert interrupt (vs deassert) #define DEASSERT 0x00000000 #define LEVEL 0x00008000 // Level triggered #define BCAST 0x00080000 // Send to all APICs, including self. #define BUSY 0x00001000 #define FIXED 0x00000000 #define ICRHI (0x0310/4) // Interrupt Command [63:32] #define TIMER (0x0320/4) // Local Vector Table 0 (TIMER) #define X1 0x0000000B // divide counts by 1 #define PERIODIC 0x00020000 // Periodic #define PCINT (0x0340/4) // Performance Counter LVT #define LINT0 (0x0350/4) // Local Vector Table 1 (LINT0) #define LINT1 (0x0360/4) // Local Vector Table 2 (LINT1) #define ERROR (0x0370/4) // Local Vector Table 3 (ERROR) #define MASKED 0x00010000 // Interrupt masked #define TICR (0x0380/4) // Timer Initial Count #define TCCR (0x0390/4) // Timer Current Count #define TDCR (0x03E0/4) // Timer Divide Configuration volatile uint *lapic; // Initialized in mp.c static void lapicw(int index, int value) { lapic[index] = value; lapic[ID]; // wait for write to finish, by reading } Sheet 64 void lapicinit(int c) { if(!lapic) return; // Enable local APIC; set spurious interrupt vector. lapicw(SVR, ENABLE | (T_IRQ0 + IRQ_SPURIOUS)); // The timer repeatedly counts down at bus frequency // from lapic[TICR] and then issues an interrupt. // If xv6 cared more about precise timekeeping, // TICR would be calibrated using an external time source. lapicw(TDCR, X1); lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER)); lapicw(TICR, 10000000); // Disable logical interrupt lines. lapicw(LINT0, MASKED); lapicw(LINT1, MASKED); // Disable performance counter overflow interrupts // on machines that provide that interrupt entry. if(((lapic[VER]>>16) & 0xFF) >= 4) lapicw(PCINT, MASKED); // Map error interrupt to IRQ_ERROR. lapicw(ERROR, T_IRQ0 + IRQ_ERROR); // Clear error status register (requires back−to−back writes). lapicw(ESR, 0); lapicw(ESR, 0); // Ack any outstanding interrupts. lapicw(EOI, 0); // Send an Init Level De−Assert to synchronise arbitration ID’s. lapicw(ICRHI, 0); lapicw(ICRLO, BCAST | INIT | LEVEL); while(lapic[ICRLO] & DELIVS) ; // Enable interrupts on the APIC (but not on the processor). lapicw(TPR, 0); } Sheet 64 Sep 5 23:39 2011 xv6/lapic.c Page 3 Sep 5 23:39 2011 xv6/lapic.c Page 4 6500 6501 6502 6503 6504 6505 6506 6507 6508 6509 6510 6511 6512 6513 6514 6515 6516 6517 6518 6519 6520 6521 6522 6523 6524 6525 6526 6527 6528 6529 6530 6531 6532 6533 6534 6535 6536 6537 6538 6539 6540 6541 6542 6543 6544 6545 6546 6547 6548 6549 6550 6551 6552 6553 6554 6555 6556 6557 6558 6559 6560 6561 6562 6563 6564 6565 6566 6567 6568 6569 6570 6571 6572 } 6573 6574 6575 6576 6577 6578 6579 6580 6581 6582 6583 6584 6585 6586 6587 6588 6589 6590 6591 6592 6593 6594 6595 6596 6597 6598 6599 int cpunum(void) { // Cannot call cpu when interrupts are enabled: // result not guaranteed to last long enough to be used! // Would prefer to panic but even printing is chancy here: // almost everything, including cprintf and panic, calls cpu, // often indirectly through acquire and release. if(readeflags()&FL_IF){ static int n; if(n++ == 0) cprintf("cpu called from %x with interrupts enabled\n", __builtin_return_address(0)); } if(lapic) return lapic[ID]>>24; return 0; } // Acknowledge interrupt. void lapiceoi(void) { if(lapic) lapicw(EOI, 0); } // Spin for a given number of microseconds. // On real hardware would want to tune this dynamically. void microdelay(int us) { } #define IO_RTC 0x70 // Start additional processor running entry code at addr. // See Appendix B of MultiProcessor Specification. void lapicstartap(uchar apicid, uint addr) { int i; ushort *wrv; // "The BSP must initialize CMOS shutdown code to 0AH // and the warm reset vector (DWORD based at 40:67) to point at // the AP startup code prior to the [universal startup algorithm]." outb(IO_RTC, 0xF); // offset 0xF is shutdown code outb(IO_RTC+1, 0x0A); Sheet 65 wrv = (ushort*)P2V((0x40<<4 | 0x67)); // Warm reset vector wrv[0] = 0; wrv[1] = addr >> 4; // "Universal startup algorithm." // Send INIT (level−triggered) interrupt to reset other CPU. lapicw(ICRHI, apicid<<24); lapicw(ICRLO, INIT | LEVEL | ASSERT); microdelay(200); lapicw(ICRLO, INIT | LEVEL); microdelay(100); // should be 10ms, but too slow in Bochs! // Send startup IPI (twice!) to enter code. // Regular hardware is supposed to only accept a STARTUP // when it is in the halted state due to an INIT. So the second // should be ignored, but it is part of the official Intel algorithm. // Bochs complains about the second one. Too bad for Bochs. for(i = 0; i < 2; i++){ lapicw(ICRHI, apicid<<24); lapicw(ICRLO, STARTUP | (addr>>12)); microdelay(200); } Sheet 65 Sep 5 23:39 2011 xv6/ioapic.c Page 1 Sep 5 23:39 2011 xv6/ioapic.c Page 2 6600 6601 6602 6603 6604 6605 6606 6607 6608 6609 6610 6611 6612 6613 6614 6615 6616 6617 6618 6619 6620 6621 6622 6623 6624 6625 6626 6627 6628 6629 6630 6631 6632 6633 6634 6635 6636 6637 6638 6639 6640 6641 6642 6643 6644 6645 6646 6647 6648 6649 6650 6651 6652 6653 6654 6655 6656 6657 6658 6659 6660 6661 6662 6663 6664 6665 6666 6667 6668 6669 6670 6671 6672 6673 6674 6675 6676 6677 6678 6679 6680 6681 6682 6683 6684 6685 6686 6687 6688 6689 6690 6691 6692 6693 6694 6695 6696 6697 6698 6699 // The I/O APIC manages hardware interrupts for an SMP system. // http://www.intel.com/design/chipsets/datashts/29056601.pdf // See also picirq.c. #include "types.h" #include "defs.h" #include "traps.h" #define IOAPIC 0xFEC00000 // Default physical address of IO APIC #define REG_ID 0x00 // Register index: ID #define REG_VER 0x01 // Register index: version #define REG_TABLE 0x10 // Redirection table base // The redirection table starts at REG_TABLE and uses // two registers to configure each interrupt. // The first (low) register in a pair contains configuration bits. // The second (high) register contains a bitmask telling which // CPUs can serve that interrupt. #define INT_DISABLED 0x00010000 // Interrupt disabled #define INT_LEVEL 0x00008000 // Level−triggered (vs edge−) #define INT_ACTIVELOW 0x00002000 // Active low (vs high) #define INT_LOGICAL 0x00000800 // Destination is CPU id (vs APIC ID) volatile struct ioapic *ioapic; // IO APIC MMIO structure: write reg, then read or write data. struct ioapic { uint reg; uint pad[3]; uint data; }; static uint ioapicread(int reg) { ioapic−>reg = reg; return ioapic−>data; } static void ioapicwrite(int reg, uint data) { ioapic−>reg = reg; ioapic−>data = data; } Sheet 66 void ioapicinit(void) { int i, id, maxintr; if(!ismp) return; ioapic = (volatile struct ioapic*)IOAPIC; maxintr = (ioapicread(REG_VER) >> 16) & 0xFF; id = ioapicread(REG_ID) >> 24; if(id != ioapicid) cprintf("ioapicinit: id isn’t equal to ioapicid; not a MP\n"); // Mark all interrupts edge−triggered, active high, disabled, // and not routed to any CPUs. for(i = 0; i <= maxintr; i++){ ioapicwrite(REG_TABLE+2*i, INT_DISABLED | (T_IRQ0 + i)); ioapicwrite(REG_TABLE+2*i+1, 0); } } void ioapicenable(int irq, int cpunum) { if(!ismp) return; // Mark interrupt edge−triggered, active high, // enabled, and routed to the given cpunum, // which happens to be that cpu’s APIC ID. ioapicwrite(REG_TABLE+2*irq, T_IRQ0 + irq); ioapicwrite(REG_TABLE+2*irq+1, cpunum << 24); } Sheet 66 Sep 5 23:39 2011 xv6/picirq.c Page 1 Sep 5 23:39 2011 xv6/picirq.c Page 2 6700 6701 6702 6703 6704 6705 6706 6707 6708 6709 6710 6711 6712 6713 6714 6715 6716 6717 6718 6719 6720 6721 6722 6723 6724 6725 6726 6727 6728 6729 6730 6731 6732 6733 6734 6735 6736 6737 6738 6739 6740 6741 6742 6743 6744 6745 6746 6747 6748 6749 6750 6751 6752 6753 6754 6755 6756 6757 6758 6759 6760 6761 6762 6763 6764 6765 6766 6767 6768 6769 6770 6771 6772 6773 6774 6775 6776 6777 6778 6779 6780 6781 6782 6783 6784 } 6785 6786 6787 6788 6789 6790 6791 6792 6793 6794 6795 6796 6797 6798 6799 // Intel 8259A programmable interrupt controllers. #include "types.h" #include "x86.h" #include "traps.h" // I/O Addresses of the two programmable interrupt controllers #define IO_PIC1 0x20 // Master (IRQs 0−7) #define IO_PIC2 0xA0 // Slave (IRQs 8−15) #define IRQ_SLAVE 2 // IRQ at which slave connects to master // Current IRQ mask. // Initial IRQ mask has interrupt 2 enabled (for slave 8259A). static ushort irqmask = 0xFFFF & ~(1<> 8); } void picenable(int irq) { picsetmask(irqmask & ~(1<’, ’?’, NO, ’ ’, NO, NO, NO, NO, NO, NO, NO, NO, NO, NO, ’8’, ’9’, ’−’, ’4’, ’5’, ’6’, ’2’, ’3’, ’0’, ’.’, NO, NO, [0x9C] ’\n’, // KP_Enter [0xB5] ’/’, // KP_Div [0xC8] KEY_UP, [0xD0] KEY_DN, [0xC9] KEY_PGUP, [0xD1] KEY_PGDN, [0xCB] KEY_LF, [0xCD] KEY_RT, [0x97] KEY_HOME, [0xCF] KEY_END, [0xD2] KEY_INS, [0xD3] KEY_DEL }; Sheet 68 ’5’, ’\b’, ’u’, ’a’, ’l’, ’c’, NO, NO, NO, ’+’, NO, ’6’, ’\t’, ’i’, ’s’, ’;’, ’v’, ’*’, NO, ’7’, ’1’, NO, // 0x00 ’%’, ’\b’, ’U’, ’A’, ’L’, ’C’, NO, NO, NO, ’+’, NO, ’^’, ’\t’, ’I’, ’S’, ’:’, ’V’, ’*’, NO, ’7’, ’1’, NO, // 0x00 // 0x10 // 0x20 // 0x30 // 0x40 // 0x50 // 0x10 // 0x20 // 0x30 // 0x40 // 0x50 Sep 5 23:39 2011 xv6/kbd.h Page 3 Sep 5 23:39 2011 xv6/kbd.c Page 1 6900 static uchar ctlmap[256] = 6901 { 6902 NO, NO, NO, NO, 6903 NO, NO, NO, NO, 6904 C(’Q’), C(’W’), C(’E’), C(’R’), 6905 C(’O’), C(’P’), NO, NO, 6906 C(’D’), C(’F’), C(’G’), C(’H’), 6907 NO, NO, NO, C(’\\’), 6908 C(’B’), C(’N’), C(’M’), NO, 6909 [0x9C] ’\r’, // KP_Enter 6910 [0xB5] C(’/’), // KP_Div 6911 [0xC8] KEY_UP, [0xD0] KEY_DN, 6912 [0xC9] KEY_PGUP, [0xD1] KEY_PGDN, 6913 [0xCB] KEY_LF, [0xCD] KEY_RT, 6914 [0x97] KEY_HOME, [0xCF] KEY_END, 6915 [0xD2] KEY_INS, [0xD3] KEY_DEL 6916 }; 6917 6918 6919 6920 6921 6922 6923 6924 6925 6926 6927 6928 6929 6930 6931 6932 6933 6934 6935 6936 6937 6938 6939 6940 6941 6942 6943 6944 6945 6946 6947 6948 6949 6950 6951 6952 6953 6954 6955 6956 6957 6958 6959 6960 6961 6962 6963 6964 6965 6966 6967 6968 6969 6970 6971 6972 6973 6974 6975 6976 6977 6978 6979 6980 6981 6982 6983 6984 6985 6986 6987 6988 6989 6990 6991 6992 6993 6994 6995 6996 6997 6998 6999 Sheet 69 NO, NO, C(’T’), ’\r’, C(’J’), C(’Z’), NO, NO, NO, C(’Y’), NO, C(’K’), C(’X’), C(’/’), NO, NO, C(’U’), C(’A’), C(’L’), C(’C’), NO, NO, NO, C(’I’), C(’S’), NO, C(’V’), NO, #include #include #include #include "types.h" "x86.h" "defs.h" "kbd.h" int kbdgetc(void) { static uint shift; static uchar *charcode[4] = { normalmap, shiftmap, ctlmap, ctlmap }; uint st, data, c; st = inb(KBSTATP); if((st & KBS_DIB) == 0) return −1; data = inb(KBDATAP); if(data == 0xE0){ shift |= E0ESC; return 0; } else if(data & 0x80){ // Key released data = (shift & E0ESC ? data : data & 0x7F); shift &= ~(shiftcode[data] | E0ESC); return 0; } else if(shift & E0ESC){ // Last character was an E0 escape; or with 0x80 data |= 0x80; shift &= ~E0ESC; } shift |= shiftcode[data]; shift ^= togglecode[data]; c = charcode[shift & (CTL | SHIFT)][data]; if(shift & CAPSLOCK){ if(’a’ <= c && c <= ’z’) c += ’A’ − ’a’; else if(’A’ <= c && c <= ’Z’) c += ’a’ − ’A’; } return c; } void kbdintr(void) { consoleintr(kbdgetc); } Sheet 69 Sep 5 23:39 2011 xv6/console.c Page 1 Sep 5 23:39 2011 xv6/console.c Page 2 7000 7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013 7014 7015 7016 7017 7018 7019 7020 7021 7022 7023 7024 7025 7026 7027 7028 7029 7030 7031 7032 7033 7034 7035 7036 7037 7038 7039 7040 7041 7042 7043 7044 7045 7046 7047 7048 7049 7050 7051 7052 7053 7054 7055 7056 7057 7058 7059 7060 7061 7062 7063 7064 7065 7066 7067 7068 7069 7070 7071 7072 7073 7074 7075 7076 7077 7078 7079 7080 7081 7082 7083 7084 7085 7086 7087 7088 7089 7090 7091 7092 7093 7094 7095 7096 7097 7098 7099 // Console input and output. // Input is from the keyboard or serial port. // Output is written to the screen and serial port. #include #include #include #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "traps.h" "spinlock.h" "fs.h" "file.h" "memlayout.h" "mmu.h" "proc.h" "x86.h" static void consputc(int); static int panicked = 0; static struct { struct spinlock lock; int locking; } cons; static void printint(int xx, int base, int sign) { static char digits[] = "0123456789abcdef"; char buf[16]; int i; uint x; if(sign && (sign = xx < 0)) x = −xx; else x = xx; i = 0; do{ buf[i++] = digits[x % base]; }while((x /= base) != 0); if(sign) buf[i++] = ’−’; while(−−i >= 0) consputc(buf[i]); } Sheet 70 // Print to the console. only understands %d, %x, %p, %s. void cprintf(char *fmt, ...) { int i, c, state, locking; uint *argp; char *s; locking = cons.locking; if(locking) acquire(&cons.lock); if (fmt == 0) panic("null fmt"); argp = (uint*)(void*)(&fmt + 1); state = 0; for(i = 0; (c = fmt[i] & 0xff) != 0; i++){ if(c != ’%’){ consputc(c); continue; } c = fmt[++i] & 0xff; if(c == 0) break; switch(c){ case ’d’: printint(*argp++, 10, 1); break; case ’x’: case ’p’: printint(*argp++, 16, 0); break; case ’s’: if((s = (char*)*argp++) == 0) s = "(null)"; for(; *s; s++) consputc(*s); break; case ’%’: consputc(’%’); break; default: // Print unknown % sequence to draw attention. consputc(’%’); consputc(c); break; } } Sheet 70 Sep 5 23:39 2011 xv6/console.c Page 3 Sep 5 23:39 2011 xv6/console.c Page 4 7100 7101 7102 7103 7104 7105 7106 7107 7108 7109 7110 7111 7112 7113 7114 7115 7116 7117 7118 7119 7120 7121 7122 7123 7124 7125 7126 7127 7128 7129 7130 7131 7132 7133 7134 7135 7136 7137 7138 7139 7140 7141 7142 7143 7144 7145 7146 7147 7148 7149 7150 7151 7152 7153 7154 7155 7156 7157 7158 7159 7160 7161 7162 7163 7164 7165 7166 7167 7168 7169 7170 7171 7172 7173 7174 7175 7176 7177 7178 7179 7180 7181 7182 7183 7184 7185 7186 7187 7188 7189 7190 7191 7192 7193 7194 7195 7196 7197 7198 7199 if(locking) release(&cons.lock); } void panic(char *s) { int i; uint pcs[10]; cli(); cons.locking = 0; cprintf("cpu%d: panic: ", cpu−>id); cprintf(s); cprintf("\n"); getcallerpcs(&s, pcs); for(i=0; i<10; i++) cprintf(" %p", pcs[i]); panicked = 1; // freeze other CPU for(;;) ; } Sheet 71 #define BACKSPACE 0x100 #define CRTPORT 0x3d4 static ushort *crt = (ushort*)P2V(0xb8000); // CGA memory static void cgaputc(int c) { int pos; // Cursor position: col + 80*row. outb(CRTPORT, 14); pos = inb(CRTPORT+1) << 8; outb(CRTPORT, 15); pos |= inb(CRTPORT+1); if(c == ’\n’) pos += 80 − pos%80; else if(c == BACKSPACE){ if(pos > 0) −−pos; } else crt[pos++] = (c&0xff) | 0x0700; // black on white if((pos/80) >= 24){ // Scroll up. memmove(crt, crt+80, sizeof(crt[0])*23*80); pos −= 80; memset(crt+pos, 0, sizeof(crt[0])*(24*80 − pos)); } outb(CRTPORT, 14); outb(CRTPORT+1, pos>>8); outb(CRTPORT, 15); outb(CRTPORT+1, pos); crt[pos] = ’ ’ | 0x0700; } void consputc(int c) { if(panicked){ cli(); for(;;) ; } if(c == BACKSPACE){ uartputc(’\b’); uartputc(’ ’); uartputc(’\b’); } else uartputc(c); cgaputc(c); } Sheet 71 Sep 5 23:39 2011 xv6/console.c Page 5 Sep 5 23:39 2011 xv6/console.c Page 6 7200 7201 7202 7203 7204 7205 7206 7207 7208 7209 7210 7211 7212 7213 7214 7215 7216 7217 7218 7219 7220 7221 7222 7223 7224 7225 7226 7227 7228 7229 7230 7231 7232 7233 7234 7235 7236 7237 7238 7239 7240 7241 7242 7243 7244 7245 7246 7247 7248 7249 7250 7251 7252 7253 7254 7255 7256 7257 7258 7259 7260 7261 7262 7263 7264 7265 7266 7267 7268 7269 7270 7271 7272 7273 7274 7275 7276 7277 7278 7279 7280 7281 7282 7283 7284 7285 7286 7287 7288 7289 7290 7291 7292 7293 7294 7295 7296 7297 7298 7299 #define INPUT_BUF 128 struct { struct spinlock lock; char buf[INPUT_BUF]; uint r; // Read index uint w; // Write index uint e; // Edit index } input; #define C(x) ((x)−’@’) // Control−x void consoleintr(int (*getc)(void)) { int c; acquire(&input.lock); while((c = getc()) >= 0){ switch(c){ case C(’P’): // Process listing. procdump(); break; case C(’U’): // Kill line. while(input.e != input.w && input.buf[(input.e−1) % INPUT_BUF] != ’\n’){ input.e−−; consputc(BACKSPACE); } break; case C(’H’): case ’\x7f’: // Backspace if(input.e != input.w){ input.e−−; consputc(BACKSPACE); } break; default: if(c != 0 && input.e−input.r < INPUT_BUF){ c = (c == ’\r’) ? ’\n’ : c; input.buf[input.e++ % INPUT_BUF] = c; consputc(c); if(c == ’\n’ || c == C(’D’) || input.e == input.r+INPUT_BUF){ input.w = input.e; wakeup(&input.r); } } break; } } release(&input.lock); } Sheet 72 int consoleread(struct inode *ip, char *dst, int n) { uint target; int c; iunlock(ip); target = n; acquire(&input.lock); while(n > 0){ while(input.r == input.w){ if(proc−>killed){ release(&input.lock); ilock(ip); return −1; } sleep(&input.r, &input.lock); } c = input.buf[input.r++ % INPUT_BUF]; if(c == C(’D’)){ // EOF if(n < target){ // Save ^D for next time, to make sure // caller gets a 0−byte result. input.r−−; } break; } *dst++ = c; −−n; if(c == ’\n’) break; } release(&input.lock); ilock(ip); return target − n; } Sheet 72 Sep 5 23:39 2011 xv6/console.c Page 7 Sep 5 23:39 2011 xv6/timer.c Page 1 7300 7301 7302 7303 7304 7305 7306 7307 7308 7309 7310 7311 7312 7313 7314 7315 7316 7317 7318 7319 7320 7321 7322 7323 7324 7325 7326 7327 7328 7329 7330 7331 7332 7333 7334 7335 7336 7337 7338 7339 7340 7341 7342 7343 7344 7345 7346 7347 7348 7349 7350 7351 7352 7353 7354 7355 7356 7357 7358 7359 7360 7361 7362 7363 7364 7365 7366 7367 7368 7369 7370 7371 7372 7373 7374 7375 7376 7377 7378 7379 7380 7381 7382 7383 7384 7385 7386 7387 7388 7389 7390 7391 7392 7393 7394 7395 7396 7397 7398 7399 int consolewrite(struct inode *ip, char *buf, int n) { int i; iunlock(ip); acquire(&cons.lock); for(i = 0; i < n; i++) consputc(buf[i] & 0xff); release(&cons.lock); ilock(ip); return n; } void consoleinit(void) { initlock(&cons.lock, "console"); initlock(&input.lock, "input"); devsw[CONSOLE].write = consolewrite; devsw[CONSOLE].read = consoleread; cons.locking = 1; picenable(IRQ_KBD); ioapicenable(IRQ_KBD, 0); } Sheet 73 // Intel 8253/8254/82C54 Programmable Interval Timer (PIT). // Only used on uniprocessors; // SMP machines use the local APIC timer. #include #include #include #include "types.h" "defs.h" "traps.h" "x86.h" #define IO_TIMER1 0x040 // 8253 Timer #1 // Frequency of all three count−down timers; // (TIMER_FREQ/freq) is the appropriate count // to generate a frequency of freq Hz. #define TIMER_FREQ #define TIMER_DIV(x) 1193182 ((TIMER_FREQ+(x)/2)/(x)) #define #define #define #define (IO_TIMER1 0x00 // 0x04 // 0x30 // TIMER_MODE TIMER_SEL0 TIMER_RATEGEN TIMER_16BIT + 3) // timer mode port select counter 0 mode 2, rate generator r/w counter 16 bits, LSB first void timerinit(void) { // Interrupt 100 times/sec. outb(TIMER_MODE, TIMER_SEL0 | TIMER_RATEGEN | TIMER_16BIT); outb(IO_TIMER1, TIMER_DIV(100) % 256); outb(IO_TIMER1, TIMER_DIV(100) / 256); picenable(IRQ_TIMER); } Sheet 73 Sep 5 23:39 2011 xv6/uart.c Page 1 Sep 5 23:39 2011 xv6/uart.c Page 2 7400 7401 7402 7403 7404 7405 7406 7407 7408 7409 7410 7411 7412 7413 7414 7415 7416 7417 7418 7419 7420 7421 7422 7423 7424 7425 7426 7427 7428 7429 7430 7431 7432 7433 7434 7435 7436 7437 7438 7439 7440 7441 7442 7443 7444 7445 7446 7447 7448 7449 7450 7451 7452 7453 7454 7455 7456 7457 7458 7459 7460 7461 7462 7463 7464 7465 7466 7467 7468 7469 7470 7471 7472 7473 7474 7475 7476 7477 7478 7479 7480 7481 7482 7483 7484 7485 7486 7487 7488 7489 7490 7491 7492 7493 7494 7495 7496 7497 7498 7499 // Intel 8250 serial port (UART). #include #include #include #include #include #include #include #include #include #include "types.h" "defs.h" "param.h" "traps.h" "spinlock.h" "fs.h" "file.h" "mmu.h" "proc.h" "x86.h" #define COM1 static int uart; 0x3f8 // is there a uart? void uartinit(void) { char *p; // Turn off the FIFO outb(COM1+2, 0); // 9600 baud, 8 data bits, 1 stop bit, parity off. outb(COM1+3, 0x80); // Unlock divisor outb(COM1+0, 115200/9600); outb(COM1+1, 0); outb(COM1+3, 0x03); // Lock divisor, 8 data bits. outb(COM1+4, 0); outb(COM1+1, 0x01); // Enable receive interrupts. // If status is 0xFF, no serial port. if(inb(COM1+5) == 0xFF) return; uart = 1; // Acknowledge pre−existing interrupt conditions; // enable interrupts. inb(COM1+2); inb(COM1+0); picenable(IRQ_COM1); ioapicenable(IRQ_COM1, 0); // Announce that we’re here. for(p="xv6...\n"; *p; p++) uartputc(*p); } Sheet 74 void uartputc(int c) { int i; if(!uart) return; for(i = 0; i < 128 && !(inb(COM1+5) & 0x20); i++) microdelay(10); outb(COM1+0, c); } static int uartgetc(void) { if(!uart) return −1; if(!(inb(COM1+5) & 0x01)) return −1; return inb(COM1+0); } void uartintr(void) { consoleintr(uartgetc); } Sheet 74 Sep 5 23:39 2011 xv6/initcode.S Page 1 Sep 5 23:39 2011 xv6/usys.S Page 1 7500 7501 7502 7503 7504 7505 7506 7507 7508 7509 7510 7511 7512 7513 7514 7515 7516 7517 7518 7519 7520 7521 7522 7523 7524 7525 7526 7527 7528 7529 7530 7531 7532 7533 7534 7535 7536 7537 7538 7539 7540 7541 7542 7543 7544 7545 7546 7547 7548 7549 7550 7551 7552 7553 7554 7555 7556 7557 7558 7559 7560 7561 7562 7563 7564 7565 7566 7567 7568 7569 7570 7571 7572 7573 7574 7575 7576 7577 7578 7579 7580 7581 7582 7583 7584 7585 7586 7587 7588 7589 7590 7591 7592 7593 7594 7595 7596 7597 7598 7599 # Initial process execs /init. #include "syscall.h" #include "traps.h" # exec(init, argv) .globl start start: pushl $argv pushl $init pushl $0 // where caller pc would be movl $SYS_exec, %eax int $T_SYSCALL # for(;;) exit(); exit: movl $SYS_exit, %eax int $T_SYSCALL jmp exit # char init[] = "/init\0"; init: .string "/init\0" # char *argv[] = { init, 0 }; .p2align 2 argv: .long init .long 0 Sheet 75 #include "syscall.h" #include "traps.h" #define SYSCALL(name) \ .globl name; \ name: \ movl $SYS_ ## name, %eax; \ int $T_SYSCALL; \ ret SYSCALL(fork) SYSCALL(exit) SYSCALL(wait) SYSCALL(pipe) SYSCALL(read) SYSCALL(write) SYSCALL(close) SYSCALL(kill) SYSCALL(exec) SYSCALL(open) SYSCALL(mknod) SYSCALL(unlink) SYSCALL(fstat) SYSCALL(link) SYSCALL(mkdir) SYSCALL(chdir) SYSCALL(dup) SYSCALL(getpid) SYSCALL(sbrk) SYSCALL(sleep) SYSCALL(uptime) Sheet 75 Sep 5 23:39 2011 xv6/init.c Page 1 Sep 5 23:39 2011 xv6/sh.c Page 1 7600 7601 7602 7603 7604 7605 7606 7607 7608 7609 7610 7611 7612 7613 7614 7615 7616 7617 7618 7619 7620 7621 7622 7623 7624 7625 7626 7627 7628 7629 7630 7631 7632 7633 7634 7635 7636 7637 7638 7639 7640 7641 7642 7643 7644 7645 7646 7647 7648 7649 7650 7651 7652 7653 7654 7655 7656 7657 7658 7659 7660 7661 7662 7663 7664 7665 7666 7667 7668 7669 7670 7671 7672 7673 7674 7675 7676 7677 7678 7679 7680 7681 7682 7683 7684 7685 7686 7687 7688 7689 7690 7691 7692 7693 7694 7695 7696 7697 7698 7699 // init: The initial user−level program #include #include #include #include "types.h" "stat.h" "user.h" "fcntl.h" char *argv[] = { "sh", 0 }; int main(void) { int pid, wpid; if(open("console", O_RDWR) < 0){ mknod("console", 1, 1); open("console", O_RDWR); } dup(0); // stdout dup(0); // stderr for(;;){ printf(1, "init: starting sh\n"); pid = fork(); if(pid < 0){ printf(1, "init: fork failed\n"); exit(); } if(pid == 0){ exec("sh", argv); printf(1, "init: exec sh failed\n"); exit(); } while((wpid=wait()) >= 0 && wpid != pid) printf(1, "zombie!\n"); } } Sheet 76 // Shell. #include "types.h" #include "user.h" #include "fcntl.h" // Parsed command representation #define EXEC 1 #define REDIR 2 #define PIPE 3 #define LIST 4 #define BACK 5 #define MAXARGS 10 struct cmd { int type; }; struct execcmd { int type; char *argv[MAXARGS]; char *eargv[MAXARGS]; }; struct redircmd { int type; struct cmd *cmd; char *file; char *efile; int mode; int fd; }; struct pipecmd { int type; struct cmd *left; struct cmd *right; }; struct listcmd { int type; struct cmd *left; struct cmd *right; }; struct backcmd { int type; struct cmd *cmd; }; Sheet 76 Sep 5 23:39 2011 xv6/sh.c Page 2 Sep 5 23:39 2011 xv6/sh.c Page 3 7700 7701 7702 7703 7704 7705 7706 7707 7708 7709 7710 7711 7712 7713 7714 7715 7716 7717 7718 7719 7720 7721 7722 7723 7724 7725 7726 7727 7728 7729 7730 7731 7732 7733 7734 7735 7736 7737 7738 7739 7740 7741 7742 7743 7744 7745 7746 7747 7748 7749 7750 7751 7752 7753 7754 7755 7756 7757 7758 7759 7760 7761 7762 7763 7764 7765 7766 7767 7768 7769 7770 7771 7772 7773 7774 7775 7776 7777 7778 7779 7780 7781 7782 7783 7784 7785 7786 7787 7788 7789 7790 7791 7792 7793 7794 7795 7796 7797 7798 7799 int fork1(void); // Fork but panics on failure. void panic(char*); struct cmd *parsecmd(char*); // Execute cmd. Never returns. void runcmd(struct cmd *cmd) { int p[2]; struct backcmd *bcmd; struct execcmd *ecmd; struct listcmd *lcmd; struct pipecmd *pcmd; struct redircmd *rcmd; if(cmd == 0) exit(); switch(cmd−>type){ default: panic("runcmd"); case EXEC: ecmd = (struct execcmd*)cmd; if(ecmd−>argv[0] == 0) exit(); exec(ecmd−>argv[0], ecmd−>argv); printf(2, "exec %s failed\n", ecmd−>argv[0]); break; case REDIR: rcmd = (struct redircmd*)cmd; close(rcmd−>fd); if(open(rcmd−>file, rcmd−>mode) < 0){ printf(2, "open %s failed\n", rcmd−>file); exit(); } runcmd(rcmd−>cmd); break; case LIST: lcmd = (struct listcmd*)cmd; if(fork1() == 0) runcmd(lcmd−>left); wait(); runcmd(lcmd−>right); break; Sheet 77 case PIPE: pcmd = (struct pipecmd*)cmd; if(pipe(p) < 0) panic("pipe"); if(fork1() == 0){ close(1); dup(p[1]); close(p[0]); close(p[1]); runcmd(pcmd−>left); } if(fork1() == 0){ close(0); dup(p[0]); close(p[0]); close(p[1]); runcmd(pcmd−>right); } close(p[0]); close(p[1]); wait(); wait(); break; case BACK: bcmd = (struct backcmd*)cmd; if(fork1() == 0) runcmd(bcmd−>cmd); break; } exit(); } int getcmd(char *buf, int nbuf) { printf(2, "$ "); memset(buf, 0, nbuf); gets(buf, nbuf); if(buf[0] == 0) // EOF return −1; return 0; } Sheet 77 Sep 5 23:39 2011 xv6/sh.c Page 4 Sep 5 23:39 2011 xv6/sh.c Page 5 7800 7801 7802 7803 7804 7805 7806 7807 7808 7809 7810 7811 7812 7813 7814 7815 7816 7817 7818 7819 7820 7821 7822 7823 7824 7825 7826 7827 7828 7829 7830 7831 7832 7833 7834 7835 7836 7837 7838 7839 7840 7841 7842 7843 7844 7845 7846 7847 7848 7849 7850 7851 7852 7853 7854 7855 7856 7857 7858 7859 7860 7861 7862 7863 7864 7865 7866 7867 7868 7869 7870 7871 7872 7873 7874 7875 7876 7877 7878 7879 7880 7881 7882 7883 7884 7885 7886 7887 7888 7889 7890 7891 7892 7893 7894 7895 7896 7897 7898 7899 int main(void) { static char buf[100]; int fd; // Assumes three file descriptors open. while((fd = open("console", O_RDWR)) >= 0){ if(fd >= 3){ close(fd); break; } } // Read and run input commands. while(getcmd(buf, sizeof(buf)) >= 0){ if(buf[0] == ’c’ && buf[1] == ’d’ && buf[2] == ’ ’){ // Clumsy but will have to do for now. // Chdir has no effect on the parent if run in the child. buf[strlen(buf)−1] = 0; // chop \n if(chdir(buf+3) < 0) printf(2, "cannot cd %s\n", buf+3); continue; } if(fork1() == 0) runcmd(parsecmd(buf)); wait(); } exit(); } void panic(char *s) { printf(2, "%s\n", s); exit(); } int fork1(void) { int pid; pid = fork(); if(pid == −1) panic("fork"); return pid; } Sheet 78 // Constructors struct cmd* execcmd(void) { struct execcmd *cmd; cmd = malloc(sizeof(*cmd)); memset(cmd, 0, sizeof(*cmd)); cmd−>type = EXEC; return (struct cmd*)cmd; } struct cmd* redircmd(struct cmd *subcmd, char *file, char *efile, int mode, int fd) { struct redircmd *cmd; cmd = malloc(sizeof(*cmd)); memset(cmd, 0, sizeof(*cmd)); cmd−>type = REDIR; cmd−>cmd = subcmd; cmd−>file = file; cmd−>efile = efile; cmd−>mode = mode; cmd−>fd = fd; return (struct cmd*)cmd; } struct cmd* pipecmd(struct cmd *left, struct cmd *right) { struct pipecmd *cmd; cmd = malloc(sizeof(*cmd)); memset(cmd, 0, sizeof(*cmd)); cmd−>type = PIPE; cmd−>left = left; cmd−>right = right; return (struct cmd*)cmd; } Sheet 78 Sep 5 23:39 2011 xv6/sh.c Page 6 Sep 5 23:39 2011 xv6/sh.c Page 7 7900 7901 7902 7903 7904 7905 7906 7907 7908 7909 7910 7911 7912 7913 7914 7915 7916 7917 7918 7919 7920 7921 7922 7923 7924 7925 7926 7927 7928 7929 7930 7931 7932 7933 7934 7935 7936 7937 7938 7939 7940 7941 7942 7943 7944 7945 7946 7947 7948 7949 7950 7951 7952 7953 7954 7955 7956 7957 7958 7959 7960 7961 7962 7963 7964 7965 7966 7967 7968 7969 7970 7971 7972 7973 7974 7975 7976 7977 7978 7979 7980 7981 7982 7983 7984 7985 7986 7987 7988 7989 7990 7991 7992 7993 7994 7995 7996 7997 7998 7999 struct cmd* listcmd(struct cmd *left, struct cmd *right) { struct listcmd *cmd; cmd = malloc(sizeof(*cmd)); memset(cmd, 0, sizeof(*cmd)); cmd−>type = LIST; cmd−>left = left; cmd−>right = right; return (struct cmd*)cmd; } struct cmd* backcmd(struct cmd *subcmd) { struct backcmd *cmd; cmd = malloc(sizeof(*cmd)); memset(cmd, 0, sizeof(*cmd)); cmd−>type = BACK; cmd−>cmd = subcmd; return (struct cmd*)cmd; } Sheet 79 // Parsing char whitespace[] = " \t\r\n\v"; char symbols[] = "<|>&;()"; int gettoken(char **ps, char *es, char **q, char **eq) { char *s; int ret; s = *ps; while(s < es && strchr(whitespace, *s)) s++; if(q) *q = s; ret = *s; switch(*s){ case 0: break; case ’|’: case ’(’: case ’)’: case ’;’: case ’&’: case ’<’: s++; break; case ’>’: s++; if(*s == ’>’){ ret = ’+’; s++; } break; default: ret = ’a’; while(s < es && !strchr(whitespace, *s) && !strchr(symbols, *s)) s++; break; } if(eq) *eq = s; while(s < es && strchr(whitespace, *s)) s++; *ps = s; return ret; } Sheet 79 Sep 5 23:39 2011 xv6/sh.c Page 8 Sep 5 23:39 2011 xv6/sh.c Page 9 8000 8001 8002 8003 8004 8005 8006 8007 8008 8009 8010 8011 8012 8013 8014 8015 8016 8017 8018 8019 8020 8021 8022 8023 8024 8025 8026 8027 8028 8029 8030 8031 8032 8033 8034 8035 8036 8037 8038 8039 8040 8041 8042 8043 8044 8045 8046 8047 8048 8049 8050 8051 8052 8053 8054 8055 8056 8057 8058 8059 8060 8061 8062 8063 8064 8065 8066 8067 8068 8069 8070 8071 8072 8073 8074 8075 8076 8077 8078 8079 8080 8081 8082 8083 8084 8085 8086 8087 8088 8089 8090 8091 8092 8093 8094 8095 8096 8097 8098 8099 int peek(char **ps, char *es, char *toks) { char *s; s = *ps; while(s < es && strchr(whitespace, *s)) s++; *ps = s; return *s && strchr(toks, *s); } struct struct struct struct cmd cmd cmd cmd *parseline(char**, char*); *parsepipe(char**, char*); *parseexec(char**, char*); *nulterminate(struct cmd*); struct cmd* parsecmd(char *s) { char *es; struct cmd *cmd; es = s + strlen(s); cmd = parseline(&s, es); peek(&s, es, ""); if(s != es){ printf(2, "leftovers: %s\n", s); panic("syntax"); } nulterminate(cmd); return cmd; } struct cmd* parseline(char **ps, char *es) { struct cmd *cmd; cmd = parsepipe(ps, es); while(peek(ps, es, "&")){ gettoken(ps, es, 0, 0); cmd = backcmd(cmd); } if(peek(ps, es, ";")){ gettoken(ps, es, 0, 0); cmd = listcmd(cmd, parseline(ps, es)); } return cmd; } Sheet 80 struct cmd* parsepipe(char **ps, char *es) { struct cmd *cmd; cmd = parseexec(ps, es); if(peek(ps, es, "|")){ gettoken(ps, es, 0, 0); cmd = pipecmd(cmd, parsepipe(ps, es)); } return cmd; } struct cmd* parseredirs(struct cmd *cmd, char **ps, char *es) { int tok; char *q, *eq; while(peek(ps, es, "<>")){ tok = gettoken(ps, es, 0, 0); if(gettoken(ps, es, &q, &eq) != ’a’) panic("missing file for redirection"); switch(tok){ case ’<’: cmd = redircmd(cmd, q, eq, O_RDONLY, 0); break; case ’>’: cmd = redircmd(cmd, q, eq, O_WRONLY|O_CREATE, 1); break; case ’+’: // >> cmd = redircmd(cmd, q, eq, O_WRONLY|O_CREATE, 1); break; } } return cmd; } Sheet 80 Sep 5 23:39 2011 xv6/sh.c Page 10 Sep 5 23:39 2011 xv6/sh.c Page 11 8100 8101 8102 8103 8104 8105 8106 8107 8108 8109 8110 8111 8112 8113 8114 8115 8116 8117 8118 8119 8120 8121 8122 8123 8124 8125 8126 8127 8128 8129 8130 8131 8132 8133 8134 8135 8136 8137 8138 8139 8140 8141 8142 8143 8144 8145 8146 8147 8148 8149 8150 8151 8152 8153 8154 8155 8156 8157 8158 8159 8160 8161 8162 8163 8164 8165 8166 8167 8168 8169 8170 8171 8172 8173 8174 8175 8176 8177 8178 8179 8180 8181 8182 8183 8184 8185 8186 8187 8188 8189 8190 8191 8192 8193 8194 8195 8196 8197 8198 8199 struct cmd* parseblock(char **ps, char *es) { struct cmd *cmd; if(!peek(ps, es, "(")) panic("parseblock"); gettoken(ps, es, 0, 0); cmd = parseline(ps, es); if(!peek(ps, es, ")")) panic("syntax − missing )"); gettoken(ps, es, 0, 0); cmd = parseredirs(cmd, ps, es); return cmd; } struct cmd* parseexec(char **ps, char *es) { char *q, *eq; int tok, argc; struct execcmd *cmd; struct cmd *ret; if(peek(ps, es, "(")) return parseblock(ps, es); ret = execcmd(); cmd = (struct execcmd*)ret; argc = 0; ret = parseredirs(ret, ps, es); while(!peek(ps, es, "|)&;")){ if((tok=gettoken(ps, es, &q, &eq)) == 0) break; if(tok != ’a’) panic("syntax"); cmd−>argv[argc] = q; cmd−>eargv[argc] = eq; argc++; if(argc >= MAXARGS) panic("too many args"); ret = parseredirs(ret, ps, es); } cmd−>argv[argc] = 0; cmd−>eargv[argc] = 0; return ret; } Sheet 81 // NUL−terminate all the counted strings. struct cmd* nulterminate(struct cmd *cmd) { int i; struct backcmd *bcmd; struct execcmd *ecmd; struct listcmd *lcmd; struct pipecmd *pcmd; struct redircmd *rcmd; if(cmd == 0) return 0; switch(cmd−>type){ case EXEC: ecmd = (struct execcmd*)cmd; for(i=0; ecmd−>argv[i]; i++) *ecmd−>eargv[i] = 0; break; case REDIR: rcmd = (struct redircmd*)cmd; nulterminate(rcmd−>cmd); *rcmd−>efile = 0; break; case PIPE: pcmd = (struct pipecmd*)cmd; nulterminate(pcmd−>left); nulterminate(pcmd−>right); break; case LIST: lcmd = (struct listcmd*)cmd; nulterminate(lcmd−>left); nulterminate(lcmd−>right); break; case BACK: bcmd = (struct backcmd*)cmd; nulterminate(bcmd−>cmd); break; } return cmd; } Sheet 81 Sep 5 23:39 2011 xv6/bootasm.S Page 1 Sep 5 23:39 2011 xv6/bootasm.S Page 2 8200 8201 8202 8203 8204 8205 8206 8207 8208 8209 8210 8211 8212 8213 8214 8215 8216 8217 8218 8219 8220 8221 8222 8223 8224 8225 8226 8227 8228 8229 8230 8231 8232 8233 8234 8235 8236 8237 8238 8239 8240 8241 8242 8243 8244 8245 8246 8247 8248 8249 8250 8251 8252 8253 8254 8255 8256 8257 8258 8259 8260 8261 8262 8263 8264 8265 8266 8267 8268 8269 8270 8271 8272 8273 8274 8275 8276 8277 8278 8279 8280 8281 8282 8283 8284 8285 8286 8287 8288 8289 8290 8291 8292 8293 8294 8295 8296 8297 8298 8299 #include "asm.h" #include "memlayout.h" #include "mmu.h" # # # # Start the first CPU: switch to 32−bit protected mode, jump into C. The BIOS loads this code from the first sector of the hard disk into memory at physical address 0x7c00 and starts executing in real mode with %cs=0 %ip=7c00. .code16 .globl start start: cli # Assemble for 16−bit mode # BIOS enabled interrupts; disable # Set up the important data segment registers (DS, ES, SS). xorw %ax,%ax # Segment number zero movw %ax,%ds # −> Data Segment movw %ax,%es # −> Extra Segment movw %ax,%ss # −> Stack Segment # Physical address line A20 is tied to zero so that the first PCs # with 2 MB would run software that assumed 1 MB. Undo that. seta20.1: inb $0x64,%al # Wait for not busy testb $0x2,%al jnz seta20.1 movb outb $0xd1,%al %al,$0x64 seta20.2: inb $0x64,%al testb $0x2,%al jnz seta20.2 movb outb $0xdf,%al %al,$0x60 # 0xd1 −> port 0x64 # Wait for not busy # 0xdf −> port 0x60 # Switch from real to protected mode. Use a bootstrap GDT that makes # virtual addresses map dierctly to physical addresses so that the # effective memory map doesn’t change during the transition. lgdt gdtdesc movl %cr0, %eax orl $CR0_PE, %eax movl %eax, %cr0 Sheet 82 # Complete transition to 32−bit protected mode by using long jmp # to reload %cs and %eip. The segment descriptors are set up with no # translation, so that the mapping is still the identity mapping. ljmp $(SEG_KCODE<<3), $start32 .code32 # Tell assembler to generate 32−bit code now. start32: # Set up the protected−mode data segment registers movw $(SEG_KDATA<<3), %ax # Our data segment selector movw %ax, %ds # −> DS: Data Segment movw %ax, %es # −> ES: Extra Segment movw %ax, %ss # −> SS: Stack Segment movw $0, %ax # Zero segments not ready for use movw %ax, %fs # −> FS movw %ax, %gs # −> GS # Set up the stack pointer and call into C. movl $start, %esp call bootmain # If bootmain returns (it shouldn’t), trigger a Bochs # breakpoint if running under Bochs, then loop. movw $0x8a00, %ax # 0x8a00 −> port 0x8a00 movw %ax, %dx outw %ax, %dx movw $0x8ae0, %ax # 0x8ae0 −> port 0x8a00 outw %ax, %dx spin: jmp spin # Bootstrap GDT .p2align 2 gdt: SEG_NULLASM SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff) SEG_ASM(STA_W, 0x0, 0xffffffff) gdtdesc: .word (gdtdesc − gdt − 1) .long gdt Sheet 82 # force 4 byte alignment # null seg # code seg # data seg # sizeof(gdt) − 1 # address gdt Sep 5 23:39 2011 xv6/bootmain.c Page 1 Sep 5 23:39 2011 xv6/bootmain.c Page 2 8300 8301 8302 8303 8304 8305 8306 8307 8308 8309 8310 8311 8312 8313 8314 8315 8316 8317 8318 8319 8320 8321 8322 8323 8324 8325 8326 8327 8328 8329 8330 8331 8332 8333 8334 8335 8336 8337 8338 8339 8340 8341 8342 8343 8344 8345 8346 8347 8348 8349 8350 8351 8352 8353 8354 8355 8356 8357 8358 8359 8360 8361 8362 8363 8364 8365 8366 8367 8368 8369 8370 8371 8372 8373 8374 8375 8376 8377 8378 8379 8380 8381 8382 8383 8384 8385 8386 8387 8388 8389 8390 8391 8392 8393 8394 8395 8396 8397 8398 8399 // // // // // // Boot loader. Part of the boot sector, along with bootasm.S, which calls bootmain(). bootasm.S has put the processor into protected 32−bit mode. bootmain() loads an ELF kernel image from the disk starting at sector 1 and then jumps to the kernel entry routine. #include #include #include #include "types.h" "elf.h" "x86.h" "memlayout.h" #define SECTSIZE 512 void readseg(uchar*, uint, uint); void bootmain(void) { struct elfhdr *elf; struct proghdr *ph, *eph; void (*entry)(void); uchar* pa; elf = (struct elfhdr*)0x10000; // scratch space // Read 1st page off disk readseg((uchar*)elf, 4096, 0); // Is this an ELF executable? if(elf−>magic != ELF_MAGIC) return; // let bootasm.S handle error // Load each program segment (ignores ph flags). ph = (struct proghdr*)((uchar*)elf + elf−>phoff); eph = ph + elf−>phnum; for(; ph < eph; ph++){ pa = (uchar*)ph−>paddr; readseg(pa, ph−>filesz, ph−>off); if(ph−>memsz > ph−>filesz) stosb(pa + ph−>filesz, 0, ph−>memsz − ph−>filesz); } // Call the entry point from the ELF header. // Does not return! entry = (void(*)(void))(elf−>entry); entry(); } Sheet 83 void waitdisk(void) { // Wait for disk ready. while((inb(0x1F7) & 0xC0) != 0x40) ; } // Read a single sector at offset into dst. void readsect(void *dst, uint offset) { // Issue command. waitdisk(); outb(0x1F2, 1); // count = 1 outb(0x1F3, offset); outb(0x1F4, offset >> 8); outb(0x1F5, offset >> 16); outb(0x1F6, (offset >> 24) | 0xE0); outb(0x1F7, 0x20); // cmd 0x20 − read sectors // Read data. waitdisk(); insl(0x1F0, dst, SECTSIZE/4); } // Read ’count’ bytes at ’offset’ from kernel into physical address ’pa’. // Might copy more than asked. void readseg(uchar* pa, uint count, uint offset) { uchar* epa; epa = pa + count; // Round down to sector boundary. pa −= offset % SECTSIZE; // Translate from bytes to sectors; kernel starts at sector 1. offset = (offset / SECTSIZE) + 1; // If this is too slow, we could read lots of sectors at a time. // We’d write more to memory than asked, but it doesn’t matter −− // we load in increasing order. for(; pa < epa; pa += SECTSIZE, offset++) readsect(pa, offset); } Sheet 83