<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wikidot="http://www.wikidot.com/rss-namespace">

	<channel>
		<title>General (new threads)</title>
		<link>http://rl-tau-2019.wikidot.com/forum/c-5177828/general</link>
		<description>Threads in the forum category &quot;General&quot;</description>
				<copyright></copyright>
		<lastBuildDate>Sat, 16 May 2026 05:08:08 +0000</lastBuildDate>
		
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12225140</guid>
				<title>Exam solution</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12225140/exam-solution</link>
				<description></description>
				<pubDate>Mon, 15 Jul 2019 15:34:23 +0000</pubDate>
				<wikidot:authorName>guest</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hello,<br /> Can you please upload a solution for the exam?</p> <p>Thanks</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12202168</guid>
				<title>Exam Moed A Solution</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12202168/exam-moed-a-solution</link>
				<description></description>
				<pubDate>Tue, 09 Jul 2019 18:11:51 +0000</pubDate>
				<wikidot:authorName>Guest</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hi,</p> <p>Can the Moed A exam and its solution be uploaded?</p> <p>Thanks</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12197422</guid>
				<title>Controllability and stability</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12197422/controllability-and-stability</link>
				<description></description>
				<pubDate>Mon, 08 Jul 2019 08:35:43 +0000</pubDate>
				<wikidot:authorName>asdf</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hi,</p> <p>In the LQR lecture we defined controllability as a sufficient condition for solving the algebraic Riccati equation (ARE).<br /> Then we defined stability, which basically tells us whether our system will explode, depending on the eigenvalues of the proposed optimal solution.<br /> Can someone explain how the two are related?<br /> Does it mean we can reach every state but cannot stay there? Or that we will try to reach it but the system will be very unstable?<br /> Also, it says that a good system is one whose eigenvalues are smaller than 1 (in magnitude), so that x_t goes to 0. Why is that good?<br /> We want x_t to reach a specific state, not zero.</p> <p>Thanks!</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12195036</guid>
				<title>Off/on policy evaluation in exam</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12195036/off-on-policy-evaluation-in-exam</link>
				<description></description>
				<pubDate>Sun, 07 Jul 2019 13:59:21 +0000</pubDate>
				<wikidot:authorName>gsdaf</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hi,</p> <p>In the exams you published there are questions that provide traces and ask us to compute the V or Q function via some method.<br /> My question is: how do we know whether the traces were generated on-policy or off-policy?<br /> This dramatically changes the computation of the estimated Q/V function.</p> <p>Thanks</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12194062</guid>
				<title>small question/clarification on recitation 5</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12194062/small-question-clarification-on-recitation-5</link>
				<description></description>
				<pubDate>Sun, 07 Jul 2019 07:38:46 +0000</pubDate>
				<wikidot:authorName>rafi levy</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>What does &quot;s(1,1) - action is chosen at step 4 &amp; 5&quot; mean on page 5-6, rec. 5?</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12191649</guid>
				<title>HW solutions</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12191649/hw-solutions</link>
				<description></description>
				<pubDate>Sat, 06 Jul 2019 12:55:05 +0000</pubDate>
				<wikidot:authorName>adsavf</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Can we please get a solution to the homework.</p> <p>Thanks</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12190788</guid>
				<title>recitation 5, ex.1</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12190788/recitation-5-ex-1</link>
				<description></description>
				<pubDate>Sat, 06 Jul 2019 08:36:38 +0000</pubDate>
				<wikidot:authorName>rafi levy</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Rec. 6, ex.1</p> <p>When do we update the entry Q(5,5)?<br /> Since it is the target (room 5), it seems it can be updated only when the episode starts in that state?<br /> If it had stayed zero, we would never have reached Q values greater than 100.</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12190751</guid>
				<title>UCB analysis exploration cost</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12190751/ucb-analysis-exploration-cost</link>
				<description></description>
				<pubDate>Sat, 06 Jul 2019 08:24:30 +0000</pubDate>
				<wikidot:authorName>recitation 10</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>In the analysis of the UCB bound there is an assumption that T_i is at least 1.<br /> It holds because in the first round we start by pulling each arm once.<br /> Shouldn't we add the cost of this round to the regret?<br /> That is, the regret should have an extra term: $\sum_{i=1}^{n} \Delta_i$</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12188079</guid>
				<title>recitation 10</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12188079/recitation-10</link>
				<description></description>
				<pubDate>Fri, 05 Jul 2019 12:30:10 +0000</pubDate>
				<wikidot:authorName>rafi levy</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hello Lee,</p> <p>In equation 8, shouldn't we write t instead of t squared? Indeed, in the last expression, the t squared is replaced by t.</p> <p>In the explanation before (16), delta(i) = mu1 - mu2.</p> <p>In equation 22, should we assume alpha is greater than 1?</p> <p>Thanks,<br /> Rafi</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12187375</guid>
				<title>Recitation 8 - new weights vector, why?</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12187375/recitation-8-new-weights-vector-why</link>
				<description></description>
				<pubDate>Fri, 05 Jul 2019 07:57:27 +0000</pubDate>
				<wikidot:authorName>Just asking</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>In the second part of recitation 8 we initialize a new weight vector for the state &quot;wait&quot;.<br /> The entire point of function approximation was to have a compact representation of the state space, so if we initialize a weight vector for each state in the space, then we are doomed, no?<br /> Why do we need a new weight vector? Why is it reasonable?</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12187353</guid>
				<title>Recitation 8 - gradient direction</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12187353/recitation-8-gradient-direction</link>
				<description></description>
				<pubDate>Fri, 05 Jul 2019 07:49:37 +0000</pubDate>
				<wikidot:authorName>just asking</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Correct me if I am wrong, but shouldn't we update the weights of the Q function in the direction of the negative gradient?<br /> It seems that we add the gradient to the weights instead of subtracting it.</p> <p>Thanks</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12185849</guid>
				<title>recitation 13 (exam solution) and recitation 11 minor fixes</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12185849/recitation-13-exam-solution-and-recitation-11-minor-fixes</link>
				<description></description>
				<pubDate>Thu, 04 Jul 2019 19:13:09 +0000</pubDate>
				<wikidot:authorName>rafi levy</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hello Lee,</p> <p>Recitation 13, 13-1 part a: there is no 'return' action from state 1 to itself, so the index should start from 2.</p> <p>Recitation 11, page 11-4: the decreasing curve is open-left and the increasing curve is open-right; the x-axis is the b(s(l)) state (tiger on left door).</p> <p>Page 11-5, the second equation: 0.15 + 0.7 * b(s(l))</p> <p>Thanks,<br /> Rafi</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12185095</guid>
				<title>Exam</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12185095/exam</link>
				<description></description>
				<pubDate>Thu, 04 Jul 2019 13:49:01 +0000</pubDate>
				<wikidot:authorName>Just asking</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hi,</p> <p>For those of us who did not attend the last class/recitation, are there any special remarks we need to know about?<br /> Can we bring formula sheets?</p> <p>Any information would be great,<br /> Thanks!</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12184037</guid>
				<title>minor fixes recitation 7</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12184037/minor-fixes-recitation-7</link>
				<description></description>
				<pubDate>Thu, 04 Jul 2019 07:24:15 +0000</pubDate>
				<wikidot:authorName>rafi levy</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hello Lee,</p> <p>On pages 1-1, 1-2 and 1-5: mainly as a matter of notation and consistency with the class, the index of R/r should start from t and not from t + 1.</p> <p>On page 1-3, in the last equation at the bottom, I believe there is a missing gamma in the second expression.</p> <p>On pages 1-4 and 1-5, perhaps it's worth mentioning as a footnote or something&#8230; V(lambda, t) is the same as G(lambda, t) (as presented on the first page) and V(n, t) is the same as G(n, t).</p> <p>Thanks,<br /> Rafi</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12174112</guid>
				<title>minor fixes in recitation 9</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12174112/minor-fixes-in-recitation-9</link>
				<description></description>
				<pubDate>Tue, 02 Jul 2019 15:39:40 +0000</pubDate>
				<wikidot:authorName>rafi levy</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hi Lee,</p> <p>In 5-6, mu is a function of theta and not of sigma; in the second gradient, it should say deviation (sigma) and not average.</p> <p>Thanks,<br /> Rafi</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12171007</guid>
				<title>Lecture 9 - Finite Differences Methods</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12171007/lecture-9-finite-differences-methods</link>
				<description></description>
				<pubDate>Tue, 02 Jul 2019 04:21:28 +0000</pubDate>
				<wikidot:authorName>guest01</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hi,<br /> In the part about finite differences methods (scribe 9, page 4) it is said in the last paragraph that we do not have the value of J(theta).<br /> Can you please explain why it is possible to obtain the values of z which are J(theta + delta * u_i) but not the value of J(theta)?</p> <p>Thanks</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-12122817</guid>
				<title>Policy iteration</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-12122817/policy-iteration</link>
				<description></description>
				<pubDate>Sat, 22 Jun 2019 12:48:22 +0000</pubDate>
				<wikidot:authorName>Amir</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Hi,<br /> In lecture 4, slide 57, it is said that policy iteration takes O(|A|*|S|^2 + |S|^3).<br /> Can you please explain why that is the case? Which step takes |A|*|S|^2 and which step takes |S|^3?</p> <p>Thanks!</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-11872026</guid>
				<title>Alternative ways to work with GPU</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-11872026/alternative-ways-to-work-with-gpu</link>
				<description></description>
				<pubDate>Sat, 18 May 2019 12:56:47 +0000</pubDate>
				<wikidot:authorName>Vered Zilberstein</wikidot:authorName>				<wikidot:authorUserId>5248283</wikidot:authorUserId>				<content:encoded>
					<![CDATA[
						 <p>Hi all,<br /> I have 2 suggestions for you:</p> <p><span style="text-decoration: underline;">1. Using Google Colab</span><br /> You can copy my Colab notebook: <a href="http://www.example.com">https://colab.research.google.com/drive/1JRS6xTvBKGL74mJP6wyj1RlBhU7-4A9O</a> and transfer your code to the new notebook.<br /> It requires adjusting the code and takes some babysitting (a session only lasts 12 hours), but it's a free GPU that gets you started quickly.</p> <p><span style="text-decoration: underline;">2. Using a virtual environment on another server</span><br /> This option is for those of you who have access to more storage on the university's machines (the environment takes about 1.5 GB and the regular disk quota is 1 GB).<br /> Just follow the instructions below:</p> <ul> <li>ssh savant/rack-gamir-g04/5/6 (from nova)</li> <li>bash</li> <li>cd &lt;directory for your env&gt;</li> <li># create the virtual environment once</li> <li>virtualenv -p /usr/local/lib/anaconda3-5.1.0/bin/python dqn_env</li> <li># NOTE: you need to activate the venv each time before running</li> <li>source dqn_env/bin/activate</li> </ul> <ul> <li># install all of the requirements</li> <li>pip install &quot;gym[atari]&quot;==0.9.5</li> <li>pip install opencv-python</li> <li>pip install torch</li> <li>pip install matplotlib</li> </ul> <ul> <li># run the project from within the environment</li> <li>python main.py</li> </ul> <ul> <li># sanity check from within the environment</li> <li>python</li> <li>import gym</li> <li>print(gym.__version__) # make sure it says 0.9.5</li> <li>gym.make('PongNoFrameskip-v4') # make sure this command succeeds</li> </ul> 
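						 <p>The sanity check above could also be kept as a small script. This is only a sketch under a few assumptions: the file name <code>check_env.py</code> and the helper names <code>check_package</code> and <code>run_sanity_check</code> are hypothetical and not part of the project code, and the <code>torch.cuda.is_available()</code> GPU check is an addition that is not in the original steps. It assumes the import names <code>gym</code>, <code>cv2</code> (for opencv-python), <code>torch</code> and <code>matplotlib</code>.</p>
<pre>
```python
import importlib

EXPECTED_GYM_VERSION = "0.9.5"  # the version pinned in the pip install step above


def check_package(name):
    """Return the module's __version__ as a string, "unknown" if it has none,
    or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(name)
    except Exception:  # broad on purpose: a broken install may raise more than ImportError
        return None
    return str(getattr(module, "__version__", "unknown"))


def run_sanity_check():
    """Print and return the installed version of each required package."""
    # Import names, not pip names: opencv-python is imported as cv2.
    report = {}
    for name in ("gym", "cv2", "torch", "matplotlib"):
        report[name] = check_package(name)
        shown = report[name] if report[name] is not None else "NOT INSTALLED"
        print(f"{name}: {shown}")
    if report["gym"] is not None and report["gym"] != EXPECTED_GYM_VERSION:
        print(f"warning: expected gym {EXPECTED_GYM_VERSION}, found {report['gym']}")
    return report


if __name__ == "__main__":
    report = run_sanity_check()
    if report["torch"] is not None:
        import torch
        # Assumption: you also want to know whether a CUDA GPU is visible.
        print("CUDA GPU visible to torch:", torch.cuda.is_available())
    if report["gym"] is not None:
        import gym
        try:
            gym.make("PongNoFrameskip-v4")  # same check as the manual step above
            print("PongNoFrameskip-v4 created successfully")
        except Exception as err:
            print("could not create PongNoFrameskip-v4:", err)
```
</pre>
						 <p>Run it with <code>python check_env.py</code> after activating dqn_env. Any line that says NOT INSTALLED points to a pip install step that needs to be repeated inside the environment.</p> 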
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-11842786</guid>
				<title>Failed running final project on university&#039;s server</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-11842786/failed-running-final-project-on-university-s-server</link>
				<description></description>
				<pubDate>Fri, 17 May 2019 09:58:52 +0000</pubDate>
				<wikidot:authorName>lior</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>Other students and I have encountered various problems running main.py and ram.py on the university's servers. We saw in another thread that you mentioned the pip install command is not suitable for these servers (rack-gamir-g0[4,5,6] and savant), so we would appreciate it if you could specify which commands we should run on these servers in order to run the project.</p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://rl-tau-2019.wikidot.com/forum/t-11714391</guid>
				<title>Final Project ffmpeg</title>
				<link>http://rl-tau-2019.wikidot.com/forum/t-11714391/final-project-ffmpeg</link>
				<description></description>
				<pubDate>Sun, 12 May 2019 16:57:17 +0000</pubDate>
				<wikidot:authorName>Amit</wikidot:authorName>								<content:encoded>
					<![CDATA[
						 <p>I am having problems with ffmpeg (the Atari environment does not recognize it).<br /> You wrote in the project that to install ffmpeg we can use Homebrew or apt-get.<br /> The problem is that I use Windows, and neither of those options (Homebrew and apt-get) is available on Windows (if I understood correctly).</p> <p>I tried unsuccessfully to install ffmpeg from some other places I found online.<br /> Is there an option not to use ffmpeg? What would you recommend I do in my case, on Windows?</p> <p>Thanks</p> 
				 	]]>
				</content:encoded>							</item>
				</channel>
</rss>